Characterizing cell-cycle as a global regulator of stochastic
transcription and noisy gene expression in S. cerevisiae
by
Katie J. Quinn
B.Engineering (Chemical & Biological) & B.Science (Molecular Biology)
University of Queensland, 2008
Submitted to the Department of Chemical Engineering
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Chemical Engineering
at the
Massachusetts Institute of Technology
June 2014
© 2014 Massachusetts Institute of Technology. All rights reserved.
Signature of Author:
Signature redacted
Department of Chemical Engineering
May 20, 2014
Signature redacted
Certified by:
Narendra Maheshri
Assistant Professor of Chemical Engineering
Thesis Supervisor
Accepted by:
Signature redacted
Patrick S. Doyle
Professor of Chemical Engineering
Chairman, Committee for Graduate Students
Characterizing cell-cycle as a global regulator of stochastic transcription and noisy gene
expression in S. cerevisiae
by
Katie J. Quinn
Submitted to the Department of Chemical Engineering on May 20, 2014
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Chemical Engineering
Abstract
Even in the same environment, genetically identical cells can exhibit remarkable
variability, or noise, in gene expression. This expression noise impacts the function of gene
regulatory networks, depending on its origins. Hence, a prerequisite for understanding or
designing gene regulatory networks is characterizing the origins and statistics of the noise.
Variability has been largely attributed to the inherently stochastic nature of transcription.
Expression statistics from multiple organisms are consistent with an influential model of
"bursty" expression, where promoters are generally inactive but infrequently produce
multiple mRNA. But fluctuations in the cell environment can also contribute, leaving the
origins of noise unclear.
We sought to determine the origins of noise in gene expression from the synthetic tetO
promoter in S. cerevisiae. We use single-molecule mRNA FISH to quantify nuclear and
cytoplasmic mRNA in a population expression distribution, and models of stochastic mRNA
production and degradation to infer underlying transcriptional dynamics. Rather than
transcriptional bursting, we find that noise is driven by large differences in transcriptional
activity between the G1 and S/G2/M stage of the cell cycle. Furthermore, we quantitatively
characterize these dynamics of transcription by measuring expression in cells arrested at the
G1/S and G2/M transition. Promoters activate in S/G2 with probability determined by
activator level. mRNA statistics from an active promoter with a single operator are Poisson;
expression with multiple operators is more variable. Promoters appear to inactivate at the
M/G1 transition, with lower activator levels leading to increased probability of inactivation.
Thus below a certain activator threshold, all cells are inactive in G1. mRNA processing and
export introduces further variability. Similar analysis of the native, chromatin-regulated
PHO5 promoter yields the same results.
Hence cell-cycle driven transcription dynamics may be prevalent among regulated yeast
genes. The timing of S/G2 activation suggests DNA replication and chromatin maturation
may be linked to repressed transcription. Cell-cycle-linked fluctuations in expression are
likely to affect gene behavior in regulatory networks. This thesis advocates the importance
of cellular context in gene regulation and reveals a novel role of cell-cycle as a driver of
eukaryotic transcription, advancing our understanding of stochastic transcription and noise
in gene expression.
Thesis Supervisor: Narendra Maheshri
Title: Assistant Professor of Chemical Engineering
Acknowledgements
I first thank my advisor Narendra Maheshri for his unfailing support and enthusiasm,
boundless ideas and knowledge, and inspiring approach to science. On top of this, Narendra
is also a natural and generous teacher, making for a wonderful mentoring experience.
I thank my thesis committee members, Arup Chakraborty and Christopher Love, for
their involvement and insights, and all the ChemE staff and faculty for support during my
time here at MIT. My thanks also go to my undergraduate advisor, Lars Nielson, who
encouraged me to pursue a PhD in the USA and gave me my first taste of independent
research at the University of Queensland in Australia. The General John Monash Awards
provided generous financial support and welcomed me into a wonderful community of
Australians abroad.
I owe my labmates, T.L., Tek-Hyung, C.J., Bradley, Shawn & Nick, for teaching me in
the lab, for valuable feedback in group meetings, and for sharing in the everyday struggles
of a PhD student. I also thank my ChemE friends, especially my housemates at Speridakis,
for many good times in and out of Building 66. I whole-heartedly thank the past and present
members of the MIT Cycling Club for being a highlight of my time at MIT. The friendships
and shared experiences have been a true pleasure, and I hope that's so for many more MIT
students to come.
To Adam, thank you for sharing and supporting me in every step of the past few years.
Finally, I thank my family, who helped me find my way to MIT in the first place. To
my brothers and sister, Simon, Jess and Andrew, thanks for the listening and counselling;
it's been a great help. And to my parents, Greg and Julie, thank you for teaching me to love
learning and for your selfless, endless support of all of my endeavors, even when they're on
the other side of the world.
Contents
Abstract
Acknowledgements
CHAPTER 1. Introduction
1.1 Genotype to phenotype: Regulation of transcriptional dynamics
1.2 Intrinsic expression noise from stochastic transcription dynamics
1.2.1 An introduction to stochastic transcription dynamics
1.2.2 A model for understanding stochastic transcription dynamics
1.2.3 Conflicting evidence for cis and trans regulators' modes of controlling transcription dynamics
1.3 Extrinsic noise in gene expression
1.4 Consequences of noise from stochastic transcription dynamics
1.5 Thesis aim and summary
1.6 References
CHAPTER 2. Mode of transcriptional regulation can qualitatively affect gene behavior in positive feedback
2.1 Abstract
2.2 Introduction: Modes of regulating transcriptional bursting
2.3 Theory:
2.3.1 Steady-state expression of frequency- or size-regulated stochastic transcriptional bursting in feedback control
2.3.2 The regime of bimodal expression
2.3.3 Two modes of regulating burst size are equivalent in the bursting limit
2.4 Results:
2.4.1 Bimodal expression patterns associated with positive feedback loops are enhanced with burst frequency regulation but reduced with burst size regulation
2.4.2 Mode of regulation affects mean expression
2.5 Discussion
2.6 References
CHAPTER 3. The cell-cycle dependence of transcription is a dominant source of noise in gene expression
3.1 Abstract
3.2 Introduction
3.3 Results:
3.3.1 Multiple transcription patterns result in expression distributions consistent with transcriptional bursting
3.3.2 Static mRNA FISH reveals cell-cycle dependent expression may create extrinsic noise in expression
3.3.3 A stochastic model to infer cell-cycle dependent transcription from mRNA expression distributions
3.3.4 Regulated transcription at low activator levels is restricted to S/G2; Constitutive expression varies with gene dosage
3.3.5 Real-time fluctuations in protein levels corroborate mRNA measurements and reveal globally correlated activation
3.4 Discussion:
3.4.1 Implications for understanding stochastic gene expression
3.4.2 Gene activation kinetics are also cell-cycle dependent
3.4.3 A hypothesis that chromatin maturation permits repressed transcription
3.5 References
CHAPTER 4. Characterization of the tetO gene regulatory function using cell-cycle arrested mRNA expression
4.1 Abstract
4.2 Introduction
4.3 Results:
4.3.1 Analysis of single-molecule nuclear and cytoplasmic mRNA FISH in arrested cells reveals instantaneous cell-cycle dependent transcription
4.3.2 mRNA expression under cell-cycle arrest reveals that activator regulates probabilistic activation of a long-lived transcribing state in S/G2
4.3.3 Conditional variances quantify the origins of gene expression noise
4.3.4 A model of transcription underlying arrested expression distributions reveals that activator only regulates the probability and stability of activity, whereas the promoter determines active transcription dynamics
4.3.5 Reproducing cell-cycle kinetics with a model of S/G2/M and G1 stationary transcription dynamics
4.3.6 Cell-cycle dependent transcription at the yeast PHO5 gene suggests generality among regulated genes in yeast
4.4 Discussion:
4.4.1 Naive interpretation of expression distributions with the bursting transcription model
4.4.2 Reconsidering cis and trans modes of regulating transcriptional dynamics in yeast
4.4.3 Predictions about cell-cycle as a global transcriptional regulator in other organisms
4.5 Conclusions
4.6 References
CHAPTER 5. Future directions
5.1 Synopsis
5.2 The origin of the S/G2 window for transcriptional activation
5.3 The complete mechanism of cell-cycle regulation of transcription
5.4 Prevalence of cell-cycle driven transcription
5.5 Towards a generalized, predictive model for stochastic transcription dynamics
5.6 Consequences of cell-cycle driven transcription in gene networks
5.7 Future techniques to observe transcription with single-molecule precision in real-time
5.8 References
CHAPTER 6. Appendix
6.1 Yeast strains and plasmids
6.2 Protocols:
6.2.1 Growth & arrest protocols
6.2.2 mRNA FISH
6.2.3 mRNA FISH image analysis
6.2.4 Numerical solutions to stochastic models
6.3 Quantification of mRNA dynamics
6.3.1 Nuclear and cytoplasmic mRNA degradation half-life
6.3.2 Rate of nuclear mRNA export
6.4 mRNA FISH error analysis
6.5 References
CHAPTER 1. Introduction
1.1 Genotype to phenotype: Regulation of transcriptional dynamics
An overarching goal of biology is to predict and explain cell and organism behavior in
response to a set of environmental conditions. Whole genome sequencing has become
commonplace, mapping not only genes but also the genetic regulatory elements that control
their expression. A new challenge is to decipher how genetic regulatory networks integrate
internal and external signals to actuate gene expression. A first-pass understanding of gene
regulation relates regulatory conditions, such as the regulatory DNA sequence and the level
of regulatory proteins, to the rate of mRNA or protein production. As such, a gene regulatory
network can in principle be modeled by ordinary differential equations that explicitly
enumerate these regulatory relations. However, transcription does not appear to occur in a
continuous, deterministic manner, but instead as a random, intermittent process where a
gene fluctuates between periods of activity and inactivity. Transcriptional dynamics include
a gene's fluctuations between states of varying transcriptional activity, despite unchanging
regulatory conditions. The resulting variability, or noise, in expression can affect the
behavior of gene regulatory networks, and so the origins of variability are important to
understand and predict gene function.
Noise in gene expression has been conceptually divided based on its origination from two
general sources: Intrinsic noise describes variability that originates from molecular noise in
the reactions inherent to transcription; extrinsic noise originates from variability in factors
that influence transcription (Elowitz et al., 2002). The next two sections discuss each of
these in turn, towards the thesis' central goal of characterizing how regulators and regulatory
elements modulate transcriptional dynamics and the resulting variability in gene expression.
1.2 Intrinsic expression noise from stochastic transcription dynamics
1.2.1 An introduction to stochastic transcription dynamics
Transcription occurs through a series of molecular interactions at a gene's promoter
leading to successful production of an mRNA. Like all chemistry, the interactions are
inherently stochastic. Most systems of chemical reactions involve large pools of each
molecular species (on the order of Avogadro's number) such that, while each molecular
transformation is stochastic, the macroscopic behavior is just the average behavior of all of
these molecules and can be described deterministically. However, a cell usually has just one
or two copies of a gene in its nucleus and the average number of mRNA produced can be
anywhere from $10^0$ to $10^5$, depending on the transcriptional activity and the organism. Thus
stochastic fluctuations in transcriptional activity can contribute to variability in mRNA and
protein expression between cells, or in a single cell over time. Variability in expression at
steady-state is quantified by noise, which is the coefficient of variation (the standard
deviation divided by mean) of expression levels between cells in a population, or by the
Fano factor, which is the variance divided by the mean of expression in the population, with
intuitive units of mRNA or protein.
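As a concrete illustration of these two statistics, the following minimal sketch computes the noise (coefficient of variation) and the Fano factor from a list of single-cell counts. The function name and the count values are hypothetical and chosen only for illustration; they are not data from this thesis.

```python
from statistics import mean, pvariance


def noise_statistics(counts):
    """Return (mean, noise, Fano factor) for per-cell expression counts."""
    m = mean(counts)
    var = pvariance(counts, mu=m)  # population variance across cells
    noise = var ** 0.5 / m         # coefficient of variation: std / mean (dimensionless)
    fano = var / m                 # Fano factor: variance / mean (units of molecules)
    return m, noise, fano


# Hypothetical mRNA counts in eight cells (illustrative values only)
counts = [0, 2, 4, 4, 6, 8, 8, 16]
m, noise, fano = noise_statistics(counts)
print(f"mean={m:.2f}  noise={noise:.2f}  Fano={fano:.2f}")
```

For a Poisson distribution the Fano factor equals one, so values well above one, as in this toy sample, indicate more variability than constant stochastic production alone would give.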
That stochastic chemistry could create biological variability was long-ago predicted from
physical principles (e.g. Schrödinger, 1944). It was first suspected as the cause of
heterogeneous induction of the lac operon inherited over generations in isogenic cells (Novick
& Weiner, 1957). Interest was more recently revived when stochastic expression appeared
to explain the previously observed (Delbrück, 1945) lysis/lysogeny decision of the phage
lambda (McAdams & Arkin, 1999; Arkin, Ross & McAdams, 1998). Genome-wide studies
of expression noise in budding yeast have since revealed that genes with highly regulated
expression (such as stress-response genes) tend to be noisier than those that are
constitutively expressed (Bar-Even et al., 2006; Newman et al., 2006), suggesting a
qualitative difference in the dynamics of constitutive and regulated transcription.
The molecular biology of transcription can ground models that attempt to explain the
source and statistics of intrinsic variability. Constitutive genes that are not noisy are most
often un-occluded by nucleosomes, existing in a state permissive to transcription, which is
limited by diffusion and assembly of the general transcription factors and machinery at fairly
regular time intervals. On the other hand, noisy genes regulated by the binding of one or
more activators to operators in their promoter are enriched in nucleosome-occluded
promoters, which rest in an inactive state where transcription is limited by chromatin
remodeling. Noisy genes are also more likely to have TATA boxes, which ensure strong
active transcription. Each of these aspects contributes to stabilizing the "reinitiation complex"
containing Mediator and other transcription cofactors (Yudkovsky, Ranish & Hahn, 2000),
enabling several rounds of productive transcription after each slow step of initiation (Figure
1-1). This fluctuation between promoter activity states is a potential origin of noise in gene
expression and informs models of transcription.
[Figure 1-1: schematic omitted. Recoverable labels include Mediator, cofactors, gene, nucleosome and binding sites.]
Figure 1-1: Components of the reinitiation complex in yeast transcription. Cis activator binding
sites (red) stabilize trans activators (yellow) at the gene promoter, providing a foundation for
assembly of the complete initiation complex.
1.2.2 A model for understanding stochastic transcription dynamics
A useful abstraction of this underlying molecular biology is the "two-state promoter"
model. It was conceived with the earliest studies of gene regulation at the lac operon (Jacob
& Monod, 1961) then developed by Peccoud & Ycart (1995). The model, depicted by Figure
1-2 and outlined in Equation 1-1, supposes a gene exists in either an inactive OFF (I) or
active ON (A) state. When ON, it produces mRNA (M), which degrades by first-order
kinetics.
[Figure 1-2: schematic omitted. Labels: I (inactive promoter), A (active promoter), M (mRNA).]
Figure 1-2: Two-state promoter model of gene expression. A promoter can exist in an inactive (I)
or active (A) state and transitions between the states with rates $\lambda$ and $\gamma$. An active promoter produces
mRNA with rate $\mu$, which is degraded by first-order kinetics with rate $\delta$.
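$$ I \underset{\gamma}{\overset{\lambda}{\rightleftharpoons}} A, \qquad A \xrightarrow{\mu} A + M, \qquad M \xrightarrow{\delta} \varnothing \qquad (1\text{-}1) $$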
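For reference, the reaction scheme above corresponds to a pair of chemical master equations. The form below is the standard one for this two-state model, written here as a sketch in notation that is an assumption of this example: $\lambda$ and $\gamma$ are the activation and inactivation rates, $\mu$ the production rate, $\delta$ the degradation rate, and $P_I(m,t)$ and $P_A(m,t)$ the probabilities of finding $m$ mRNA with the promoter inactive or active.

```latex
\begin{aligned}
\frac{dP_I(m,t)}{dt} &= \gamma P_A(m,t) - \lambda P_I(m,t)
  + \delta\left[(m+1)P_I(m+1,t) - m P_I(m,t)\right] \\
\frac{dP_A(m,t)}{dt} &= \lambda P_I(m,t) - \gamma P_A(m,t)
  + \mu\left[P_A(m-1,t) - P_A(m,t)\right]
  + \delta\left[(m+1)P_A(m+1,t) - m P_A(m,t)\right]
\end{aligned}
```

Here $P_A(-1,t) = 0$ by convention. Setting both time derivatives to zero gives the steady-state distribution $P(m) = P_I(m) + P_A(m)$.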
For low-noise constitutive genes, the model reduces to a single state, where the promoter
lives in the active state and produces mRNA with rate $\mu$. At steady-state, constant
stochastic production yields a Poisson distribution of mRNA. As such, Poisson mRNA
statistics represent a "null" case of minimal expression noise. Figure 1-3 shows an example
trajectory of production and degradation and the resulting steady-state mRNA expression
distribution. Poisson statistics of mRNA have been observed for multiple constitutive genes
in yeast (Larson et al., 2011; Zenklusen, Larson & Singer, 2008).
[Figure 1-3: plots omitted. Recoverable axis labels: Time (hour) in panel A and mRNA Count in panel B.]
Figure 1-3: The trajectory and stationary expression distribution of a Poisson production process.
(A) mRNA count fluctuates over time due to stochastic single-molecule production and degradation
events (Burst frequency: $\lambda/\delta$ = 10, Burst size: $\mu/\gamma$ = 1 mRNA). (B) Sampled over long times or
across a large population, expression is a Poisson distribution with mean and variance of 10 mRNA.
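The Poisson "null" case can be checked numerically with a short kinetic Monte Carlo (Gillespie) simulation of constant production and first-order degradation. This is a minimal sketch, not the code used in this thesis: the function name, the rates (production 10 per mRNA lifetime, degradation 1), the run length and the seed are all illustrative assumptions.

```python
import random


def simulate_constitutive(mu, delta, t_end, seed=0):
    """Gillespie simulation of production at rate mu and decay at rate delta * m.

    Returns the time-averaged mean and variance of the mRNA count m.
    """
    rng = random.Random(seed)
    t = 0.0
    m = round(mu / delta)  # start near the expected steady-state mean
    sum_m = 0.0            # integral of m over time
    sum_m2 = 0.0           # integral of m^2 over time
    while True:
        total = mu + delta * m           # total rate of all possible events
        dt = rng.expovariate(total)      # waiting time to the next event
        if t + dt >= t_end:
            dt = t_end - t               # truncate the last interval at t_end
            sum_m += m * dt
            sum_m2 += m * m * dt
            break
        sum_m += m * dt
        sum_m2 += m * m * dt
        t += dt
        if rng.random() * total < mu:
            m += 1                       # production event
        else:
            m -= 1                       # degradation event
    mean = sum_m / t_end
    variance = sum_m2 / t_end - mean ** 2
    return mean, variance


mean, variance = simulate_constitutive(mu=10.0, delta=1.0, t_end=2000.0, seed=1)
print(f"mean={mean:.2f}  variance={variance:.2f}  Fano={variance / mean:.2f}")
```

With these assumed rates the time-averaged mean and variance should both come out close to 10, giving a Fano factor near one, up to sampling error from the finite run length.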
On the other hand, single-cell studies of population distributions and real-time activity
suggest that expression of regulated eukaryotic genes can occur in bursts of transcription
consistent with a model where the promoter switches randomly and rarely from a stable
inactive state to a short-lived, actively-transcribing state (Larson et al., 2011; Raj et al.,
2006). Rare transcriptional initiation and then rapid reinitiation (discussed above) may
represent the molecular basis for this dynamic behavior (Hahn, 1998; Struhl, 1996;
Yudkovsky, Ranish, & Hahn, 2000). In this case, the two-state promoter operates in the
"bursting" regime, with rare activation compared to inactivation ($\tilde{\lambda} \ll \tilde{\gamma}$) and promoter
fluctuations that are faster than the mRNA lifetime ($\tilde{\gamma} > 1$), where tilde denotes that a
parameter is normalized by the degradation rate. In this regime, $\tilde{\lambda}$ is the burst frequency
and $\tilde{\mu}/\tilde{\gamma}$ is the burst size, and the distribution can be described by a two-parameter Gamma
distribution:
$$ P(x) = \frac{x^{\tilde{\lambda}-1}\, e^{-x/(\tilde{\mu}/\tilde{\gamma})}}{(\tilde{\mu}/\tilde{\gamma})^{\tilde{\lambda}}\, \Gamma(\tilde{\lambda})} \qquad (1\text{-}2) $$
This is equivalent to the result of Friedman, Cai & Xie (2006). The discrete equivalent,
representing integer mRNA counts, is the negative binomial distribution. The two
parameters specify the burst frequency and burst size respectively and their product is the
expression mean. While the model says nothing about the actual molecular events leading
to transcription, the burst frequency is thought to correspond to the transcriptional
initiation rate and the burst size may correspond to the transcriptional reinitiation and/or
elongation rate. Figure 1-4 shows an example trajectory of transcriptional bursting and the
resulting steady-state expression distribution, parameterized by a burst frequency and burst
size, and fit with the negative binomial distribution of those parameters. (While Master
equations that describe the two-state model at steady-state have an analytical solution,
more complicated stochastic models that lack analytical solutions must be solved
numerically, with kinetic Monte Carlo simulations or a Finite Markov Chain method
(Munsky & Khammash, 2006).)
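To make the kinetic Monte Carlo approach concrete, the sketch below simulates the two-state promoter model and compares the time-averaged statistics with the negative binomial expectation (mean = burst frequency × burst size, Fano factor = 1 + burst size). It is an illustrative example rather than the code used in this thesis. The function name, run length and seed are hypothetical, and the rates (activation 1, inactivation 20, production 200, degradation 1) are assumptions chosen to give a burst frequency of 1 and a burst size of 10.

```python
import random


def simulate_two_state(lam, gamma, mu, delta, t_end, seed=0):
    """Gillespie simulation of the two-state (OFF/ON) promoter model.

    lam: OFF->ON rate, gamma: ON->OFF rate, mu: production rate when ON,
    delta: per-molecule degradation rate. Returns time-averaged (mean, variance).
    """
    rng = random.Random(seed)
    t, active, m = 0.0, False, 0
    sum_m = 0.0
    sum_m2 = 0.0
    while True:
        r_switch = gamma if active else lam   # promoter state change
        r_prod = mu if active else 0.0        # production only from the ON state
        total = r_switch + r_prod + delta * m
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            dt = t_end - t                    # truncate the last interval
            sum_m += m * dt
            sum_m2 += m * m * dt
            break
        sum_m += m * dt
        sum_m2 += m * m * dt
        t += dt
        u = rng.random() * total              # choose which event fires
        if u < r_switch:
            active = not active
        elif u < r_switch + r_prod:
            m += 1
        else:
            m -= 1
    mean = sum_m / t_end
    return mean, sum_m2 / t_end - mean ** 2


lam, gamma, mu, delta = 1.0, 20.0, 200.0, 1.0
mean, variance = simulate_two_state(lam, gamma, mu, delta, t_end=5000.0, seed=2)
burst_frequency, burst_size = lam / delta, mu / gamma
print(f"simulated: mean={mean:.2f}  Fano={variance / mean:.2f}")
print(f"bursting limit: mean={burst_frequency * burst_size:.2f}  Fano={1 + burst_size:.2f}")
```

Because the assumed inactivation rate is finite, the simulated mean and Fano factor are expected to fall somewhat below the bursting-limit values; the agreement improves as inactivation becomes faster at fixed burst size.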
[Figure 1-4: plots omitted. Recoverable axis labels: Time (hour) in panel A and mRNA Count in panel B.]
Figure 1-4: The trajectory and stationary expression distribution of a bursting process. (A) mRNA
count fluctuates widely over time due to "bursts" of transcription (Burst frequency: $\lambda/\delta$ = 1, Burst
size: $\mu/\gamma$ = 10 mRNA). (B) Sampled over long times or across a large population, expression is a
negative binomial distribution with a mean of 10 mRNA and parameters corresponding to the burst
frequency and burst size. This bursting and the previous Poisson example (Figure 1-3) have the same
mean expression level but very different distributions.
A regulatory element's mode of affecting transcriptional dynamics could be inferred from
how the expression distribution changes with mean expression. For the case of the two-state
promoter model, the product of the burst frequency and burst size gives the mean expression.
Transcriptional regulators could affect expression via the frequency or size of bursts.
Regulation of burst frequency will increase sampling as the mean increases, thereby
decreasing the noise; regulation of burst size will not (Figure 1-5).
[Figure 1-5: plots omitted. Recoverable labels: Burst frequency regulation, Burst size regulation, Mean (log), Signal.]
Figure 1-5: Regulation via burst frequency or burst size will affect expression noise. For two
hypothetical genes with equivalent mean expression in response to an activating signal (A), the gene
regulated via burst frequency will decrease expression noise as activation increases whereas the gene
under burst size control will not (B).
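The two noise-mean trends follow directly from the moments of the negative binomial distribution: with burst frequency $a$ and burst size $b$, the mean is $ab$ and the variance is $ab(1+b)$, so the squared noise is $(1+b)/(ab)$. The sketch below tabulates this for the two modes of regulation; the function name, the baseline values ($a$ = 1, $b$ = 10) and the fold changes are illustrative assumptions.

```python
def negative_binomial_noise(burst_frequency, burst_size):
    """Mean and squared noise (CV^2) in the negative binomial bursting limit."""
    mean = burst_frequency * burst_size
    variance = mean * (1.0 + burst_size)  # Fano factor = 1 + burst size
    return mean, variance / mean ** 2


print("fold  CV^2 (frequency-regulated)  CV^2 (size-regulated)")
for fold in (1, 2, 4, 8):
    # Same fold increase in mean, achieved by raising frequency or size
    _, cv2_freq = negative_binomial_noise(1.0 * fold, 10.0)
    _, cv2_size = negative_binomial_noise(1.0, 10.0 * fold)
    print(f"{fold:4d}  {cv2_freq:26.3f}  {cv2_size:21.3f}")
```

Under burst frequency regulation the squared noise falls as the inverse of the mean, whereas under burst size regulation it levels off at $1/a$ however large the mean becomes.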
1.2.3 Conflicting evidence for cis and trans regulators' modes of controlling transcription dynamics
Many studies have sought to observe and characterize transcriptional dynamics. These
studies use single-cell techniques that fall in two classes: measuring mRNA dynamics in real-time and back-calculating transcription events (corresponding to the trajectories of Figure
1-3 & 1-4 A); and measuring mRNA expression across a static cell population and inferring
steady-state transcriptional dynamics (as in the distributions of Figure 1-3 & 1-4 B).
Golding et al. (2005) were the first to visualize transcription in real-time, introducing
repeats of secondary structure in mRNA that bound an MS2 phage coat protein fused to
GFP. This enabled counting the production of individual MS2-GFP-labeled mRNA
transcripts in growing E. coli. mRNA appeared to be produced in "bursts" of
transcription. Studies in mammalian cells (Suter et al., 2011; Harper et al., 2011) using
luciferase as a readout suggest mammalian promoters are activated in bursts, followed by
an inactive refractory period.
Results of earlier studies which inferred transcriptional dynamics from protein noise in
static populations are also consistent with bursting dynamics. Consistent with expectations,
when Ozbudak et al. (2002) modulated bursting at the translational level in the prokaryote
B. subtilis (via point mutations that affect ribosome binding and thus translational
efficiency, p), noise remained high with increasing expression level (Kaern et al., 2005).
Blake et al. (2003) demonstrated the same in S. cerevisiae. Raser & O'Shea (2004) used a
yeast strain with two homologous reporters of PHO5 expression to measure intrinsic noise
at the protein level. A mutant of the TATA-binding site, expected to decrease
transcription rate, $\mu$, decreased noise. A mutation of the activator binding site, expected to
decrease promoter activation, $\lambda$, increased noise, both consistent with the noise-mean trends
of the bursting model.
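The two-reporter approach separates intrinsic from extrinsic noise because two identical reporters in the same cell share extrinsic fluctuations but not intrinsic ones. The sketch below uses the estimators usually attributed to Elowitz et al. (2002); whether each study cited above applied exactly these expressions is an assumption of this example. The function name and the reporter values are hypothetical.

```python
def dual_reporter_noise(x, y):
    """Squared intrinsic and extrinsic noise from paired reporter levels.

    x[i] and y[i] are the two reporter measurements in cell i.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(x, y)) / n
    mean_prod = sum(a * b for a, b in zip(x, y)) / n
    # Uncorrelated differences between the two reporters: intrinsic noise
    intrinsic = mean_sq_diff / (2.0 * mean_x * mean_y)
    # Covariance between the two reporters: extrinsic noise
    extrinsic = (mean_prod - mean_x * mean_y) / (mean_x * mean_y)
    return intrinsic, extrinsic


# Hypothetical reporter levels in four cells (illustrative values only)
reporter_1 = [10, 20, 30, 40]
reporter_2 = [12, 18, 33, 37]
intrinsic, extrinsic = dual_reporter_noise(reporter_1, reporter_2)
print(f"intrinsic noise^2={intrinsic:.4f}  extrinsic noise^2={extrinsic:.4f}")
```

In this toy data set the two reporters track each other closely across cells, so nearly all of the variability is assigned to the extrinsic term.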
Multiple studies have examined noise in protein expression genome-wide in both S.
cerevisiae (Bar-Even et al., 2006; Newman et al., 2006; Hornung et al., 2012) and E. coli
(Taniguichi, 2010). In yeast, these studies revealed the correlation between regulated and
noisy genes. They also found that, in general, noise decreased with the inverse square-root of
mean expression, suggesting that gene activity is predominantly controlled via modulating
the frequency of activation events. Similar global regulation of burst frequency was seen in
bacteria (Taniguichi, 2010). However, regulated genes are often repressed in standard growth
conditions and thus all such genes may not be captured in these trends. Also, the inverse
square root scaling of noise with protein abundance was not seen clearly at high levels of
expression, where extrinsic noise is expected to dominate.
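One simple way to test for this scaling is to fit the slope of log noise against log mean across genes: a slope near -0.5 corresponds to noise falling with the inverse square root of mean expression. The sketch below applies an ordinary least-squares fit to synthetic values generated from the negative binomial relation with a fixed burst size of 10. The function name and all numbers are illustrative assumptions, not data from the cited studies.

```python
import math


def loglog_slope(means, noises):
    """Least-squares slope of log(noise) versus log(mean)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(n) for n in noises]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


# Synthetic genes sharing a burst size of 10 but differing in burst frequency,
# so that noise = sqrt((1 + burst size) / mean)
means = [1.0, 10.0, 100.0, 1000.0]
noises = [math.sqrt(11.0 / m) for m in means]
print(f"fitted slope = {loglog_slope(means, noises):.3f}")
```

With real data the fitted slope would be expected to flatten toward zero at high expression if a mean-independent extrinsic noise floor dominates there.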
Additional studies have focused on measuring expression statistics from several different
promoters to infer how changes in cis regulatory elements can affect noise. Zenklusen et al.
(2008) found Poisson statistics at three constitutive genes, as expected. But expression of
the regulated PDR5 gene had higher noise, and an expression distribution well-fit by the
negative binomial distribution solution to stationary bursting. Carey et al. (2013)
conditioned on activator-specific effects by measuring expression from multiple genes
activated by the same transcription factor, and saw that the level of noise, or "burstiness",
was a function of the promoter sequence. They also saw that the degree of noise depended
on whether the transcription factor acted as an activator or repressor (repression lowered
the apparent burst size). So et al. (2011) measured expression from multiple bacterial genes
and concluded that both burst frequency (at lower levels) and burst size (at higher levels)
increased expression level. Dar et al. (2012) and Skupsky et al. (2010) integrated a single
promoter at many locations in a human genome and reported evidence of bursting across
all locations with burst frequency and then burst size increasing with mean expression.
However the latter study did not measure stationary expression, a requirement for the
applicability of the negative binomial noise-mean trends. A further cause for thought is that
this trend is consistent with the expected dominance of extrinsic noise at high expression
levels (discussed in the next section).
More detailed studies systematically varying regulatory elements within a single gene's
promoter have proven particularly effective for identifying regulatory trends. In yeast,
Murphy et al. (2007) saw that placing transcription factor binding sites closer to the
promoter increased burst size, perhaps increasing the productivity or the stability of the
initiation complex. This study, work in our lab (To & Maheshri, 2010), and work in
mammalian cells (Raj et al., 2006; Suter et al., 2011) compared expression from promoters
identical except for the number of activator binding sites and showed noise increases with
binding site number. Dadiani et al. (2013) performed an interesting study in yeast,
engineering increased expression of a single gene via either increased binding site strength
or nucleosome-disfavoring sequence. The former substantially increased expression noise
("burst size"), the latter did not. This suggests the nucleosome disfavoring sequence
transitioned the dynamics out of the "bursting" regime and into continuous activity. Raj et
al. (2006) measured mRNA expression distributions from two identical copies of a gene in
single diploid mammalian cells and saw that fluctuations between the two loci were largely
uncorrelated, suggesting intrinsic origins. The shape of the distributions was consistent with
bursting.
Table 1-1 summarizes the conclusions of these studies according to whether they found
cis or trans regulators to affect burst size, frequency, or both. There is strong evidence for
cis elements, such as promoter architecture (binding site number, chromatin structure) and
local chromatin environment, dictating the "size" of transcriptional bursts. How both gene-specific and global trans regulators affect bursting is less clear.
While the paradigm of "bursty" transcriptional dynamics seems prevalent, its veracity
and ubiquity are not established. Muramoto et al. (2012) detected periodic, long-lived pulses
of transcription in real time in Dictyostelium, rather than bursts. Particularly troubling
is the fact that models for transcriptional bursting describe the variability in mRNA levels
from purely intrinsic sources, whereas appreciable amounts of extrinsic noise have been
measured in a large number of studies, although generally at the protein level. Using FACS
to measure protein noise does allow for gating by cell shape, but removes no other extrinsic
variability. Several studies (including Blake et al., 2003; Raser & O'Shea, 2004; So et al.,
2011; To & Maheshri, 2010) used scaling arguments to justify that noise in their data was
intrinsic: it was claimed that $\eta^2$ decreasing monotonically with mean, $\sigma^2/\mu$ approaching one
at very low expression levels, and $\eta^2$ decreasing sharply with mean at high expression levels
were all indicative of intrinsic rather than extrinsic noise. Our noise data also have these
scaling properties, but their origins have proven to be predominantly extrinsic. Exceptions
that do explicitly account for extrinsic variability are few, including the original study with
two mRNA reporters (Raj et al., 2006) and a theoretical study (Shahrezaei, Ollivier & Swain,
2008).
Table 1-1: Summary of studies identifying cis- and trans-regulation of bursting dynamics
Columns: Burst frequency regulation | Burst size regulation

Cis variation
  Binding site number or strength:
    S. cerevisiae: Raser & O'Shea 2004; Blake et al. 2006; Murphy et al. 2007; To & Maheshri 2010; Dadiani et al. 2013
    Mammalian: Raj et al. 2006; Suter et al. 2011
  TATA strength:
    S. cerevisiae: Raser & O'Shea 2004; Mogno et al. 2010; Hornung et al. 2012
  Nucleosome occupancy/remodeling:
    S. cerevisiae: Bai et al. 2010; Dadiani et al. 2013
  Genomic location:
    Mammalian: Skupsky et al. 2010; Dar et al. 2012
  Multiple genes (promoters):
    E. coli: So et al. 2011
    S. cerevisiae: Hornung et al. 2012

Trans variation
  Activator levels or activity:
    E. coli: Pedraza & van Oudenaarden 2005; Golding et al. 2005
    S. cerevisiae: Raser & O'Shea 2004; Mao et al. 2010
  Activator levels:
    E. coli: Choi et al. 2008
    S. cerevisiae: Mao et al. 2010; Carey et al. 2013
  Global protein noise:
    E. coli: Taniguchi et al. 2010
    S. cerevisiae: Bar-Even et al. 2006; Newman et al. 2006
1.3 Extrinsic noise in gene expression
Extrinsic noise originates from fluctuations in upstream factors that impact gene
expression. Upstream fluctuations can occur in global factors (affecting all genes) or gene-specific pathway components. They may derive from intrinsic noise in the
expression of those factors, but this need not be the case. For example, cell size seems an important
determinant of global transcriptional activity (Raj, unpublished). The cell-cycle, including
the doubled DNA copy number of S/G2/M, may also be a source of transcription
variability (Elliott & McLaughlin, 1978; Volfson et al., 2006). Stochastic partitioning of
mRNA and protein upon cell division is another source of extrinsic noise, one that appears as
intrinsic noise in a popular experimental method for measuring contributions from the two
sources (Huh & Paulsson, 2011). An early experimental example of consequential stochastic
gene expression, the lysis/lysogeny decision of phage lambda, was later attributed to
extrinsic sources (St-Pierre & Endy, 2008). Hilfinger & Paulsson (2011) noted that even
intrinsic noise parameters will depend on extrinsic factors in the history of the cell. Studies
of global protein expression (Bar-Even et al., 2006) showed evidence of a baseline of extrinsic
noise, with noise never measured below a coefficient of variation of 0.2. Yet these important
sources of extrinsic variability are often assumed to have small, non-qualitative effects on the
variability of mRNA and transcription, because of the wealth of experimental results
consistent with the transcriptional bursting model (Table 1-1).
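The two-reporter method mentioned above can be illustrated with a short numerical sketch. It assumes the estimator form commonly attributed to Elowitz et al. (2002), applied to paired per-cell counts from two identical reporters. The function name `decompose_noise` and the simulated data (a gamma-distributed shared factor with independent Poisson sampling for each reporter) are hypothetical choices made only for illustration, not the analysis used in this thesis.

```python
import numpy as np

def decompose_noise(a, b):
    """Split total noise (squared CV) into intrinsic and extrinsic parts.

    a, b: per-cell expression of two identical reporters (paired by cell).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = a.mean() * b.mean()
    intrinsic = np.mean((a - b) ** 2) / (2.0 * norm)   # uncorrelated part
    extrinsic = (np.mean(a * b) - norm) / norm          # correlated part
    total = (np.mean(a ** 2 + b ** 2) - 2.0 * norm) / (2.0 * norm)
    return intrinsic, extrinsic, total

# Hypothetical data: a shared (extrinsic) rate per cell, mean 20 and CV^2 = 0.25,
# with independent Poisson sampling for each reporter (intrinsic CV^2 = 1/20).
rng = np.random.default_rng(0)
rate = rng.gamma(shape=4.0, scale=5.0, size=100_000)
a = rng.poisson(rate)
b = rng.poisson(rate)

intrinsic, extrinsic, total = decompose_noise(a, b)
print(f"intrinsic={intrinsic:.3f} extrinsic={extrinsic:.3f} total={total:.3f}")
```

By construction the intrinsic and extrinsic terms sum to the total. The sketch does not model partitioning at division, which, as noted above, such a measurement would report as intrinsic noise.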
1.4 Consequences of noise from stochastic transcription dynamics
Stochastic transcription and expression variability are of particular interest because they can
have qualitative consequences for phenotype. In synthetic gene circuits, noise has stabilized
toggle switches and oscillators (Becskei, Seraphin & Serrano, 2001; Elowitz & Leibler, 2000;
Gardner, Cantor & Collins, 2000). Our lab previously demonstrated that noise can cause
bimodal expression in a transcriptional positive feedback loop even when deterministic
models predict no bistability (To & Maheshri, 2010). Stochastic expression also has
consequences in evolution and development of organisms. Heterogeneous expression among
a population of isogenic unicellular organisms may lead to variability that assists with
responses to changes in nutrients (e.g. lactose utilization, Ozbudak et al., 2004), to stress
(e.g. competence in B. subtilis, Maamar, Raj & Dubnau, 2007; Suel et al., 2006; Suel et al.,
2007) or to pathogens (e.g. bacterial persistence against antibiotics, Blake et al., 2006) and
may play a role in development (e.g. variability in a stem cell marker correlated strongly
with choice of lineage, Chang et al., 2008).
But consistent with intuition about control systems, noise in gene expression can also be
detrimental, limiting information transfer (Bialek & Setayeshgar, 2005; Lestas, Vinnicombe
& Paulsson, 2010), such that gene regulatory networks have evolved to suppress noise effects
(McAdams & Arkin, 1999; Raj et al., 2010). Noise from transcriptional dynamics may be an
unavoidable biophysical consequence of achieving a highly regulable range of transcription
(Bremer & Ehrenberg, 1995; Guptasarma, 1996; Salman et al., 2012). (One counterargument is that ribosomal genes have high dynamic range with little noise. But
these genes differ fundamentally: they are transcribed by their own polymerase (RNA Pol I) with
ON/OFF, rather than graded, control.) In either case, the fact that organisms are known
both to exploit noise and to evolve to minimize it is evidence of its significance for gene regulatory
network performance.
1.5 Thesis aim and summary
One summary of the current understanding of transcriptional dynamics is: Transcription
occurs in continuous, stochastic burst events, with a frequency regulated by activator level
and a size dependent on promoter architecture and constraints of the molecular biology of
transcription in the particular organism. But studies have emerged suggesting that extrinsic,
global regulators of gene expression have been underappreciated, pointing to a direction for the
field to advance. Another question remaining unanswered is how transcriptional dynamics
control the kinetics of a response to changing regulatory signals. This thesis addresses the
former, exploring a novel case of global transcriptional regulation. While kinetics are not
covered here, characterization of stationary dynamics is a key step towards predicting kinetic
behavior.
This work aims to characterize transcriptional dynamics, and thus the origin of noise in
gene expression, at the yeast tetO gene in response to cis and trans regulators and the cell-cycle, which we reveal as a global regulator of transcription. In contrast to the current
understanding of single-cell dynamics and noise, we establish that a large difference in
transcription between cell-cycle stages drives noisy expression at the tetO promoter, and
suggest this may be prevalent at other regulated yeast genes. Specifically, we ask: What are
the dynamics of transcription in response to the level of activator proteins, number of
activator binding sites at the gene promoter, and global process of the cell-division cycle?
In Chapter 2 we present a case study demonstrating how the mode of regulating
transcriptional dynamics can affect phenotype. We consider a hypothetical pair of promoters
transcribed with bursting dynamics: one whose expression is modulated by the frequency of
bursts, and the other by the size of bursts. Hence they differ in how their intrinsic noise
varies with changes in expression level. We analyze expression in positive feedback and see
that regulation via burst frequency can create bimodal expression, or stabilize deterministic
bistability; whereas regulation of burst size never creates bimodality, and instead destabilizes
bistability. Hence regulators' mode of controlling dynamics is important for gene function.
Previous studies' attempts to infer transcription dynamics have been quick to attribute
all noise in expression to the model of bursting transcription. But a strain with two copies
of our "noisy" gene of interest revealed that much of the noise we observed was extrinsic,
deriving from sources other than transcription itself. When probing the origin of this
extrinsic noise, we uncovered a strong dependence of transcription on the cell-cycle.
In Chapter 3, we establish the nature of this cell-cycle driven transcription. We measure
transcription rates with two single-cell methods: by tracking protein levels in real-time we
back-calculate time courses of transcription rates, albeit with relatively low molecular and
temporal resolution; by counting single mRNA molecules in a static population segregated
by cell-cycle stage, we obtain a higher resolution readout of recent transcription but with
no temporal information. An immediate result is that transcription increases 2-fold from G1
to S/G2/M for constitutive or highly-expressed, regulated genes. Though previously
underappreciated, this is expected because the copy number of each gene doubles during S-phase DNA replication between G1 and G2. We show that this dominates super-Poissonian
noise in growing populations' constitutive gene expression. Of greater interest, we find that
transcription under repressed conditions only occurs in S/G2/M. Clearly, something is at
play alleviating transcriptional repression in early S/G2. We hypothesize that chromatin
maturation following DNA replication creates a permissive window for transcription from
otherwise inactive genes.
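To see why a 2-fold difference in transcription between cell-cycle stages produces super-Poissonian population statistics even without bursting, consider a minimal sketch. It assumes, purely for illustration, that mRNA counts are Poisson within each stage (mean `lam` in G1, `2 * lam` in S/G2/M) and that a fraction `f_g1` of cells is in G1. These names and values are hypothetical, and the sketch ignores mRNA carry-over between stages.

```python
import numpy as np

def mixture_fano(lam, f_g1):
    """Fano factor of a population mixing Poisson(lam) cells (G1)
    with Poisson(2 * lam) cells (S/G2/M)."""
    mean = lam * (2.0 - f_g1)
    # Law of total variance: Poisson variance plus variance of the stage means.
    var = mean + lam ** 2 * f_g1 * (1.0 - f_g1)
    return var / mean

# Hypothetical parameters chosen only for illustration.
rng = np.random.default_rng(1)
lam, f_g1, n_cells = 10.0, 0.4, 200_000
in_g1 = rng.random(n_cells) < f_g1
counts = rng.poisson(np.where(in_g1, lam, 2.0 * lam))

simulated = counts.var() / counts.mean()
predicted = mixture_fano(lam, f_g1)
print(f"simulated Fano = {simulated:.2f}, predicted Fano = {predicted:.2f}")
```

Under these assumptions the Fano factor is 1 + lam * f_g1 * (1 - f_g1) / (2 - f_g1), so the excess over Poisson grows with expression level although each stage is individually Poissonian.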
But the resolution afforded by these techniques limited further characterization of
transcriptional dynamics. mRNA expression blurs fluctuations in transcription on the
timescale of its turnover. In Chapter 4, we seek a new approach to quantify both the cell-cycle dependence of transcription and transcriptional dynamics within a single cell-cycle
stage. We find that cell-cycle arrest yields mRNA expression that reflects the pseudo-steady-state transcription dynamics of each cell-cycle stage. Levels of mRNA at the site of
transcription in the nucleus provide a more immediate readout of transcriptional activity
than cytoplasmic mRNA, which has undergone processing and export.
We find that cells with or without nuclear mRNA have very different levels of
cytoplasmic mRNA, indicating an active state lifetime on the timescale of a cell-cycle stage.
Activation occurs in S/G2 with a probability determined by activator level. Activator also
modulates the stability of the active state across the M/G1 transition, such that there is a
distinct activator threshold below which there is no G1 expression. But expression from an
active locus is essentially activator-independent. An active locus with a single operator has
minimally variable Poisson expression statistics; multiple operators result in higher
expression noise, perhaps accessing multiple activity states that correspond to discrete
promoter occupancy states. The growing population's normalized variance (i.e. the Fano
factor) peaks at a particular activator level, and this peak is revealed to arise because noise is greatest when
50% of cells activate in S/G2. This is a new picture of transcriptional dynamics that
challenges previous expectations of transcriptional bursting dynamics.
This thesis answers some questions but creates many more. We hypothesize that S/G2
alleviation of repression is due to the delay in maturation and assembly of new chromatin
following DNA replication. Several studies link chromatin maturation to transcription, but
is this at play here, and if so what is the molecular mechanism? Is this phenomenon of cell-cycle dependent de-repression of transcription universal, or shared among a smaller class of
genes, and why? Can it affect protein-level behavior within a signaling network? What
developments in experimental techniques are necessary to complete our understanding of
transcriptional dynamics? We discuss these open questions in Chapter 5.
Together, this thesis reveals a novel pattern of transcriptional regulation and provides
an example of the importance of considering cell context when studying gene expression and
regulation. The role of cell cycle as a regulator is remarkable and previously unappreciated
in the context of noisy gene expression. It has implications for network design in synthetic
biology, and for understanding gene expression perturbed by disease, particularly in the fast-growing cells of cancers and developing organisms. This work to characterize the
transcriptional dynamics of a single gene is one very small step towards developing
generalized and predictive models of gene regulation, which may eventually enable us to
understand and engineer biology as we would any other physical system.
1.6 References
Arkin, A., Ross, J., & McAdams, H. H. (1998). Stochastic kinetic analysis of
developmental pathway bifurcation in phage-lambda infected Escherichia coli cells.
Genetics, 149(4), 1633-1648.
Bai, L., Charvin, G., Siggia, E. D., & Cross, F. R. (2010). Nucleosome-depleted regions in
cell-cycle-regulated promoters ensure reliable gene expression in every cell cycle.
Developmental Cell, 18(4), 544-555.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., & Barkai, N.
(2006). Noise in protein expression scales with natural protein abundance. Nature
Genetics, 38(6), 636-643.
Becskei, A., Seraphin, B., & Serrano, L. (2001). Positive feedback in eukaryotic gene
networks: Cell differentiation by graded to binary response conversion. EMBO
Journal, 20(10), 2528-2535.
Bialek, W., & Setayeshgar, S. (2005). Physical limits to biochemical signaling. Proceedings
of the National Academy of Sciences of the United States of America, 102(29), 10040-10045.
Blake, W. J., Balázsi, G., Kohanski, M. A., Isaacs, F. J., Murphy, K. F., Kuang, Y.,
Collins, J. J. (2006). Phenotypic consequences of promoter-mediated transcriptional
noise. Molecular Cell, 24(6), 853-865.
Blake, W. J., Kaern, M., Cantor, C. R., & Collins, J. J. (2003). Noise in eukaryotic gene
expression. Nature, 422(6932), 633-637.
Bremer, H., & Ehrenberg, M. (1995). Guanosine tetraphosphate as a global regulator of
bacterial RNA synthesis: A model involving RNA polymerase pausing and queuing.
Biochimica et Biophysica Acta (BBA)-Gene Structure and Expression, 1262(1), 15-36.
Carey, L. B., van Dijk, D., Sloot, P. M. A., Kaandorp, J. A., & Segal, E. (2013). Promoter
sequence determines the relationship between expression level and noise. PLoS Biol,
11(4), e1001528.
Chang, H. H., Hemberg, M., Barahona, M., Ingber, D. E., & Huang, S. (2008).
Transcriptome-wide noise controls lineage choice in mammalian progenitor cells.
Nature, 453(7194), 544-547.
Cheung, A. M., & Cramer, P. (2012). A movie of RNA polymerase II transcription. Cell,
149(7), 1431-1437.
Choi, P. J., Cai, L., Frieda, K., & Xie, X. S. (2008). A stochastic single-molecule event
triggers phenotype switching of a bacterial cell. Science, 322(5900), 442-446.
Dadiani, M., van Dijk, D., Segal, B., Field, Y., Ben-Artzi, G., Raveh-Sadka, T., Segal, E.
(2013). Two DNA-encoded strategies for increasing expression with opposing effects
on promoter dynamics and transcriptional noise. Genome Research, 23(6), 966-976.
Dar, R. D., Razooky, B. S., Singh, A., Trimeloni, T. V., McCollum, J. M., Cox, C. D., ...
Weinberger, L. S. (2012). Transcriptional burst frequency and burst size are equally
modulated across the human genome. Proceedings of the National Academy of
Sciences, 109(43), 17454-17459.
Delbruck, M. (1945). The burst size distribution in the growth of bacterial viruses
(bacteriophages). The Journal of Bacteriology, 50(2), 131-135.
Elliott, S. G., & McLaughlin, C. S. (1978). Rate of macromolecular synthesis through the
cell cycle of the yeast Saccharomyces cerevisiae. Proceedings of the National Academy
of Sciences, 75(9), 4384-4388.
Elowitz, M. B., & Leibler, S. (2000). A synthetic oscillatory network of transcriptional
regulators. Nature, 403(6767), 335-338.
Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic gene
expression in a single cell. Science, 297(5584), 1183-1186.
Friedman, N., Cai, L., & Xie, X. S. (2006). Linking stochastic dynamics to population
distribution: An analytical framework of gene expression. Physical Review Letters,
97(16), 168302.
Gardner, T. S., Cantor, C. R., & Collins, J. J. (2000). Construction of a genetic toggle
switch in escherichia coli. Nature, 403(6767), 339-342.
Guptasarma, P. (1996). Cooperative relaxation of supercoils and periodic transcriptional
initiation within polymerase batteries. Bioessays, 18(4), 325-332.
Hahn, S. (1998). Activation and the role of reinitiation in the control of transcription by
RNA polymerase II. Cold Spring Harbor Symposia on Quantitative Biology, 63, 181-188.
Harper, C., Finkenstädt, B., Woodcock, D., Friedrichsen, S., Semprini, S., Ashall, L.,
White, M. (2011). Dynamic analysis of stochastic transcription cycles. PLoS Biology,
9(4), e1000607.
Hilfinger, A., & Paulsson, J. (2011). Separating intrinsic from extrinsic fluctuations in
dynamic biological systems. Proceedings of the National Academy of Sciences,
108(29), 12167-12172.
Hornung, G., Bar-Ziv, R., Rosin, D., Tokuriki, N., Tawfik, D. S., Oren, M., & Barkai, N.
(2012). Noise-mean relationship in mutated promoters. Genome Research, 22(12),
2409-2417.
Huh, D., & Paulsson, J. (2011). Non-genetic heterogeneity from stochastic partitioning at
cell division. Nature Genetics, 43(2), 95-100.
Huisinga, K. L., & Pugh, B. F. (2004). A genome-wide housekeeping role for TFIID and a
highly regulated stress-related role for SAGA in saccharomyces cerevisiae. Molecular
Cell, 13(4), 573-585.
Jacob, F., & Monod, J. (1961). On the regulation of gene activity. Cold Spring Harbor
Symposia on Quantitative Biology, 26, 193-211.
Jacob, F., & Monod, J. (1961). Genetic regulatory mechanisms in the synthesis of
proteins. Journal of Molecular Biology, 3(3), 318-356.
Kaern, M., Elston, T. C., Blake, W. J., & Collins, J. J. (2005). Stochasticity in gene
expression: From theories to phenotypes. Nature Reviews. Genetics, 6(6), 451-464.
Larson, D. R., Zenklusen, D., Wu, B., Chao, J. A., & Singer, R. H. (2011). Real-time
observation of transcription initiation and elongation on an endogenous yeast gene.
Science, 332(6028), 475-478.
Lestas, I., Vinnicombe, G., & Paulsson, J. (2010). Fundamental limits on the suppression
of molecular fluctuations. Nature, 467(7312), 174-178.
Maamar, H., Raj, A., & Dubnau, D. (2007). Noise in gene expression determines cell fate
in bacillus subtilis. Science, 317(5837), 526-529. doi:10.1126/science.1140818
Mao, C., Brown, C. R., Falkovskaia, E., Dong, S., Hrabeta-Robinson, E., Wenger, L., &
Boeger, H. (2010). Quantitative analysis of the transcription control mechanism. Mol
Syst Biol, 6(1), -.
McAdams, H. H., & Arkin, A. (1999). It's a noisy business! Genetic regulation at the
nanomolar scale. Trends in Genetics, 15(2), 65-69.
Mogno, I., Vallania, F., Mitra, R. D., & Cohen, B. A. (2010). TATA is a modular
component of synthetic promoters. Genome Research, 20(10), 1391-1397.
Munsky, B., & Khammash, M. (2006). The finite state projection algorithm for the
solution of the chemical master equation. The Journal of Chemical Physics, 124(4), -.
Murakami, K. S., & Darst, S. A. (2003). Bacterial RNA polymerases: The whole story.
Current Opinion in Structural Biology, 13(1), 31-39.
Muramoto, T., Cannon, D., Gierliński, M., Corrigan, A., Barton, G. J., & Chubb, J. R.
(2012). Live imaging of nascent RNA dynamics reveals distinct types of
transcriptional pulse regulation. Proceedings of the National Academy of Sciences,
109(19), 7350-7355.
Murphy, K. F., Balázsi, G., & Collins, J. J. (2007). Combinatorial promoter design for
engineering noisy gene expression. Proceedings of the National Academy of Sciences,
104(31), 12726-12731.
Newman, J. R. S., Ghaemmaghami, S., Ihmels, J., Breslow, D. K., Noble, M., DeRisi, J.
L., & Weissman, J. S. (2006). Single-cell proteomic analysis of S. cerevisiae reveals
the architecture of biological noise. Nature, 441(7095), 840-846.
Novick, A., & Weiner, M. (1957). Enzyme induction as an all-or-none phenomenon.
Proceedings of the National Academy of Sciences of the United States of America,
43(7), 553-566.
Ozbudak, E. M., Thattai, M., Lim, H. N., Shraiman, B. I., & Van Oudenaarden, A.
(2004). Multistability in the lactose utilization network of escherichia coli. Nature,
427(6976), 737-740.
Ozbudak, E. M., Thattai, M., Kurtser, I., Grossman, A. D., & van Oudenaarden, A.
(2002). Regulation of noise in the expression of a single gene. Nature Genetics, 31(1),
69-73.
Peccoud, J., & Ycart, B. (1995). Markovian modeling of gene-product synthesis.
Theoretical Population Biology, 48(2), 222-234.
Pedraza, J. M., & van Oudenaarden, A. (2005). Noise propagation in gene networks.
Science, 307(5717), 1965-1969.
Raj, A., Rifkin, S. A., Andersen, E., & van Oudenaarden, A. (2010). Variability in gene
expression underlies incomplete penetrance. Nature, 463(7283), 913-918.
Raj, A., Peskin, C., Tranchina, D., Vargas, D., & Tyagi, S. (2006). Stochastic mRNA
synthesis in mammalian cells. PLoS Biol, 4(10), e309-e309.
Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A., & Tyagi, S. (2008). Imaging
individual mRNA molecules using multiple singly labeled probes. Nat Meth, 5(10),
877-879.
Raser, J. M., & O'Shea, E. K. (2004). Control of stochasticity in eukaryotic gene
expression. Science, 304(5678), 1811-1814. doi:10.1126/science.1098641
Salman, H., Brenner, N., Tung, C., Elyahu, N., Stolovicki, E., Moore, L., Braun, E.
(2012). Universal protein fluctuations in populations of microorganisms. Physical
Review Letters, 108(23), 238105.
Sanchez, A., & Golding, I. (2013). Genetic determinants and cellular constraints in noisy
gene expression. Science, 342(6163), 1188-1193.
Schrödinger, E. (1944). What is life? Cambridge University Press.
Skupsky, R., Burnett, J. C., Foley, J. E., Schaffer, D. V., & Arkin, A. P. (2010). HIV
promoter integration site primarily modulates transcriptional burst size rather than
frequency. PLoS Computational Biology, 6(9), e1000952.
So, L., Ghosh, A., Zong, C., Sepulveda, L. A., Segev, R., & Golding, I. (2011). General
properties of transcriptional time series in Escherichia coli. Nature Genetics, 43(6),
554-560.
St-Pierre, F., & Endy, D. (2008). Determination of cell fate selection during phage lambda
infection. Proceedings of the National Academy of Sciences, 105(52), 20705-20710.
Struhl, K. (1996). Chromatin structure and RNA polymerase II connection: Implications
for transcription. Cell, 84(2), 179-182.
Suel, G. M., Garcia-Ojalvo, J., Liberman, L. M., & Elowitz, M. B. (2006). An excitable
gene regulatory circuit induces transient cellular differentiation. Nature, 440(7083),
545-550.
Suel, G. M., Kulkarni, R. P., Dworkin, J., Garcia-Ojalvo, J., & Elowitz, M. B. (2007).
Tunability and noise dependence in differentiation dynamics. Science, 315(5819),
1716-1719.
Suter, D. M., Molina, N., Gatfield, D., Schneider, K., Schibler, U., & Naef, F. (2011).
Mammalian genes are transcribed with widely different bursting kinetics. Science,
332(6028), 472-474.
Taniguchi, Y., Choi, P. J., Li, G., Chen, H., Babu, M., Hearn, J., Xie, X. S. (2010).
Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in
single cells. Science, 329(5991), 533-538.
Tirosh, I., Barkai, N., & Verstrepen, K. J. (2009). Promoter architecture and the
evolvability of gene expression. Journal of Biology, 8(95).
Tirosh, I., & Barkai, N. (2008). Two strategies for gene regulation by promoter
nucleosomes. Genome Research, 18(7), 1084-1091.
To, T., & Maheshri, N. (2010). Noise can induce bimodality in positive transcriptional
feedback loops without bistability. Science, 327(5969), 1142-1145.
Volfson, D., Marciniak, J., Blake, W. J., Ostroff, N., Tsimring, L. S., & Hasty, J. (2006).
Origins of extrinsic variability in eukaryotic gene expression. Nature, 439(7078), 861-864.
Wang, Z., Gerstein, M., & Snyder, M. (2009). RNA-seq: A revolutionary tool for
transcriptomics. Nature Reviews. Genetics, 10(1), 57-63.
Yudkovsky, N., Ranish, J. A., & Hahn, S. (2000). A transcription reinitiation intermediate
that is stabilized by activator. Nature, 408(6809), 225-229.
Zenklusen, D., Larson, D. R., & Singer, R. H. (2008). Single-RNA counting reveals
alternative modes of gene expression in yeast. Nat Struct Mol Biol, 15(12), 1263-1271.
Zhang, Z., Revyakin, A., Grimm, J. B., Lavis, L. D., Tjian, R., & Kadonaga, J. T. (2014).
Single-molecule tracking of the transcription cycle by sub-second RNA detection.
ELife Sciences, 3.
CHAPTER 2.
Mode of transcriptional regulation can
qualitatively affect gene behavior in positive feedback
2.1 Abstract
Cellular information processing often employs multi-stability for decision-making,
memory and bet-hedging. Within gene networks, multi-stability is accomplished via positive
feedback loops. We demonstrate with a theoretical case study that gene expression noise in
these networks can either stabilize/create or destabilize/eliminate bimodal gene expression
patterns when transcriptional activators modulate burst frequency or burst size,
respectively. This illustrates how the mode by which regulatory elements actuate
transcription can have profound implications for network and cell behavior. Hence correct
characterization of stochastic transcription dynamics is important for design and analysis
of gene networks.
2.2 Introduction: Modes of regulating transcriptional bursting
Myriad studies have demonstrated that gene expression is stochastic, and a stochastic
description of even simple regulatory networks can lead to unintuitive and even qualitatively
different behavior as compared to a deterministic description (Acar, Becskei & van
Oudenaarden, 2005; Blake et al., 2006; Cagatay et al., 2009; Elowitz et al., 2002; Kaern et
al., 2005; Maamar, Raj & Dubnau, 2007; Maheshri & O'Shea, 2007; McAdams & Arkin,
1997; Raser & O'Shea, 2004; Suel et al., 2006; To & Maheshri, 2010; Turcotte, Garcia-Ojalvo
& Suel, 2008). For example, stochastic noise propagated through networks can either
increase (Rosenfeld et al., 2005) or decrease (Paulsson & Ehrenberg, 2000; Thattai & van
Oudenaarden, 2001) variability of downstream gene expression. It has been debated whether
noise in gene expression could have evolutionary benefits, such as in creating diversity in
response to a change in environmental conditions (e.g. the decision between the lytic or
lysogenic response in phage-infected bacteria; Raj & van Oudenaarden, 2008), or whether
noise is an unavoidable consequence of molecular events that evolution has uniformly
selected against (McAdams & Arkin, 1999; Raj et al., 2010).
Recent single molecule approaches suggest that noisy gene expression largely arises from
random and intermittent "bursts" of transcription (Cai, Friedman & Xie, 2006; Chubb et
al., 2006; Golding et al., 2005; Raj & van Oudenaarden, 2009; Taniguchi et al., 2010). A
two-state promoter model has been employed to interpret these results (Peccoud & Ycart,
1995; Raj et al., 2006; Shahrezaei, Ollivier & Swain, 2008; introduced in Chapter 1). In this
model, promoters transition between inactive and active states and produce transcript
when active (Figure 1-2). The model has three kinetic parameters: the promoter activation
rate, the promoter deactivation rate, and the transcription rate when active. The observed
"bursty" transcription is consistent with rare transitions of the promoter from the inactive
to a short-lived but highly productive active state, resulting in a burst of mRNA expression.
The statistics of bursting can be succinctly described by two parameters: the burst frequency
(set by the promoter activation rate), which is the number of bursts over the lifetime of mRNA or
protein, and the burst size (the ratio of the transcription rate to the deactivation rate), which
is the number of mRNA or proteins produced per burst.
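The two-state model described above can be sketched as a stochastic simulation. The following is a minimal illustration using the Gillespie algorithm for a single promoter with first-order mRNA degradation. The function name, the parameter names (`k_on`, `k_off`, `k_tx`, `gamma`) and the numerical values are assumptions chosen for illustration, with rates expressed in units of the inverse mRNA lifetime.

```python
import random

def simulate_two_state(k_on, k_off, k_tx, gamma, t_end, seed=0):
    """Gillespie simulation of a two-state promoter.

    Returns the time-averaged mean and Fano factor of the mRNA count.
    """
    rng = random.Random(seed)
    t, active, m = 0.0, False, 0
    sum_m = sum_m2 = 0.0
    while t < t_end:
        r_switch = k_off if active else k_on   # promoter state change
        r_tx = k_tx if active else 0.0         # transcription (active only)
        r_deg = gamma * m                      # mRNA degradation
        total = r_switch + r_tx + r_deg
        dt = rng.expovariate(total)
        step = min(dt, t_end - t)              # truncate the final interval
        sum_m += m * step
        sum_m2 += m * m * step
        t += dt
        if t >= t_end:
            break
        u = rng.random() * total               # choose which reaction fires
        if u < r_switch:
            active = not active
        elif u < r_switch + r_tx:
            m += 1
        else:
            m -= 1
    mean = sum_m / t_end
    fano = (sum_m2 / t_end - mean ** 2) / mean
    return mean, fano

# Hypothetical parameters: rare activation, short-lived active state.
k_on, k_off, k_tx, gamma = 0.2, 2.0, 20.0, 1.0
burst_frequency = k_on / gamma   # bursts per mRNA lifetime
burst_size = k_tx / k_off        # mRNA produced per burst
mean, fano = simulate_two_state(k_on, k_off, k_tx, gamma, t_end=20000.0)
exact_mean = (k_tx / gamma) * k_on / (k_on + k_off)
print(f"burst size x frequency = {burst_size * burst_frequency:.2f}")
print(f"simulated mean = {mean:.2f}, exact mean = {exact_mean:.2f}, Fano = {fano:.2f}")
```

The product of burst size and burst frequency approximates the exact two-state mean only when the deactivation rate greatly exceeds the activation rate, which is the bursting limit described above. A Fano factor well above one reflects the super-Poissonian statistics expected of bursty expression.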
For a "bursty" gene, the mean level of gene expression is simply the product of burst
size and burst frequency. Transcription factors that modulate this mean level may do so by
affecting burst frequency, burst size, or a combination of both (Hahn, 1998). Which
mode of regulation is being employed can be inferred by examining how the intrinsic noise
in gene expression scales with the mean expression level (Friedman, Cai & Xie, 2006; Raser
& O'Shea, 2004). When activators regulate burst frequency, gene expression noise decreases
with increased expression (with an inverse square-root dependence on protein abundance); in
contrast, when activators regulate burst size, gene expression noise is constant (Figure 2-1).
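These two scaling behaviors can be made explicit with the negative binomial statistics of stationary bursting mentioned in Chapter 1. As a sketch, write $a$ for the burst frequency (bursts per mRNA lifetime) and $b$ for the mean burst size. These symbols are notation chosen here for illustration, and the expressions assume purely intrinsic noise in the bursting limit:

```latex
\begin{align}
\mu &= a\,b, \qquad \sigma^2 = a\,b\,(1+b), \\
\eta^2 &= \frac{\sigma^2}{\mu^2} = \frac{1}{a} + \frac{1}{a\,b} = \frac{1+b}{\mu}.
\end{align}
```

With $b$ fixed and $a$ varied, $\eta^2 = (1+b)/\mu$, so the noise $\eta$ falls as $\mu^{-1/2}$. With $a$ fixed and $b$ varied, $\eta^2 = 1/a + 1/\mu$, which tends to the constant $1/a$ once $\mu \gg 1$.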
Several experimental studies have examined the dependence of gene expression noise on
cis and trans factors. Strong TATA boxes (Mogno et al., 2010; Raser & O'Shea, 2004) and
higher number and affinity of activator binding sites (Raj et al., 2006; Suter et al., 2011; To
& Maheshri, 2010) appear to increase burst size. In cases where intrinsic noise in protein
expression was directly measured in response to changing a transcriptional activator, such
as at the PHO5 gene in budding yeast (Raser & O'Shea, 2004) or a lac promoter variant in
E. coli (Pedraza & van Oudenaarden, 2005), the intrinsic noise appears to scale with the
inverse square root of protein abundance, characteristic of burst frequency regulation.
Still, a wealth of biochemical evidence exists for activators (and repressors) to influence
transcriptional reinitiation and thereby potentially burst size (Hahn, 2004). In eukaryotes,
burst sizes have been measured in the range of 10O-102 mRNA (reviewed in Sanchez &
Golding, 2013). Generally, regulable genes exhibit low basal expression, including a basal
burst size in the absence (or presence) of activators (or repressors). A typical average mRNA
copy number of such repressed genes in yeast is -0.1-1 mRNA per cell (Holland, 2002). If
these genes were subject to pure burst frequency regulation, then the basal burst size would
remain 10-100 mRNAs at very low levels of expression. That would mean a small but
measurable (-0.1%) fraction of cells would appear strongly ON for gene expression, yet this
has not been reported. Therefore, it would seem that burst size must be regulated at some
point in the transition from basal to regulated expression. Single molecule studies examining
the synthetic Tet OFF system in HeLa cells (Raj et al., 2006) and the lac repressor (Choi
et al., 2008) in E. coli provide evidence of burst size regulation. Global studies of noise in
protein expression in E. coli (Taniguchi et al., 2010) and yeast (Bar-Even et al., 2006;
Newman et al., 2006) show that the noise of the majority of these genes scales with the inverse
square root of abundance at low to intermediate levels of expression when extrinsic noise
does not dominate. One interpretation consistent with the data is that the higher expression
level of stronger promoters is due to higher burst frequency. While this may be true for many
"constitutive" housekeeping genes that tend to be less noisy, highly regulable genes tend
to deviate from this dependence (Bar-Even et al., 2006; Zenklusen, Larsen & Singer, 2008).
Close examination of the E. coli data set indicates that low expressing promoters
(corresponding to <20 proteins) span a 10-fold range of both burst size and frequency
(Taniguchi et al., 2010). This is consistent with recent examination of burst statistics in
several different E. coli promoters (So et al., 2011) that suggests that differences in
expression level are largely due to changes in burst size. Taken together, the notion that
burst size and frequency can both be regulated by trans factors for many genes seems a
reasonable one.
Here, we investigate the key differences that burst size and frequency regulation can make to
the outcome of simple gene circuits involving feedback loops. Positive feedback with burst
frequency regulation has previously been shown to stabilize deterministically bistable states
or create a bimodal expression distribution when bistability is not predicted (Friedman,
Cai & Xie, 2006; Karmakar & Bose, 2007; Samoilov, Plyasunov & Arkin, 2005; To &
Maheshri, 2010). Using both analytical and computational methods, we demonstrate that
burst size regulation can have the opposite effect, destabilizing deterministically bistable
states and eliminating bimodal expression.
2.3 Theory:
2.3.1 Steady-state expression of frequency- or size-regulated stochastic transcriptional bursting in feedback control
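We use a standard two promoter state model of gene expression (Raj et al., 2006):
(2-1)  $I \underset{\gamma}{\overset{\lambda}{\rightleftharpoons}} A, \qquad A \xrightarrow{\mu} A + x, \qquad x \xrightarrow{\delta} \varnothing$
Here $I$ and $A$ are the inactive and active promoter states, $\lambda$ and $\gamma$ are the promoter activation and inactivation rates, $\mu$ is the synthesis rate of x from the active promoter, and $\delta$ is the degradation rate of x.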
All rates have been normalized by the lifetime of x. If fluctuations are only assumed to
arise due to stochastic transitions between the inactive and active promoter states, then a
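Fokker-Planck equation can be formulated for the probability density of x (Raj et al., 2006):
(2-2)  $\frac{d}{dx}\big(f(x)\,p(x)\big) = g(x)\,p(x), \qquad f(x) = -x\left(1 - \frac{x}{\mu}\right), \qquad g(x) = -\lambda\left(1 - \frac{x}{\mu}\right) + \gamma\,\frac{x}{\mu}$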
While x naturally corresponds to mRNA, we will assume coupled transcription and
translation so that it may correspond to protein. This assumption is exact when mRNA and
protein lifetimes are largely different (Raj et al., 2006). The continuum approximation on x
is valid when the production of x is high and $\mu/\delta > 1$. The solution to the Fokker-Planck
equation has been shown to be a Beta distribution (Raj et al., 2006). Under conditions of
"bursty" gene expression, where gene activation is rare compared to inactivation ($\lambda \ll \gamma$) and
promoter inactivation is faster than mRNA/protein degradation ($\gamma > 1$), the
solution simplifies to a Gamma distribution (Friedman, Cai & Xie, 2006) characterized by
two parameters: the burst frequency $\lambda$ and the burst size $\mu/\gamma$. (The discrete equivalent,
without the continuum approximation, is the negative binomial distribution.)
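As a numerical sanity check, the sketch below assumes that the Fokker-Planck coefficients are $f(x) = -x(1 - x/\mu)$ and $g(x) = -\lambda(1 - x/\mu) + \gamma x/\mu$ (our reading of the equation above) and that the Beta-form density is $p(x) \propto x^{\lambda-1}(1 - x/\mu)^{\gamma-1}$. It evaluates the residual of the equation by finite differences and compares the Beta and Gamma kernels in the bursty limit. Function names and parameter values are illustrative only.

```python
import math

def fp_residual(x, lam, gam, mu, h=1e-6):
    """Residual d/dx[f(x) p(x)] - g(x) p(x) for the unnormalized Beta-form density."""
    def p(y):
        return y ** (lam - 1.0) * (1.0 - y / mu) ** (gam - 1.0)
    def f(y):
        return -y * (1.0 - y / mu)
    def g(y):
        return -lam * (1.0 - y / mu) + gam * y / mu
    # Central finite difference of f(x) p(x).
    d_fp = (f(x + h) * p(x + h) - f(x - h) * p(x - h)) / (2.0 * h)
    return d_fp - g(x) * p(x)

def log_beta_kernel(xhat, lam, gam):
    """Log of xhat^(lam-1) (1-xhat)^(gam-1), with xhat = x / mu."""
    return (lam - 1.0) * math.log(xhat) + (gam - 1.0) * math.log(1.0 - xhat)

def log_gamma_kernel(xhat, lam, gam):
    """Log of xhat^(lam-1) exp(-gam xhat), the bursty-limit form."""
    return (lam - 1.0) * math.log(xhat) - gam * xhat

residual = fp_residual(x=15.0, lam=2.0, gam=5.0, mu=50.0)
# Fast inactivation (gam = 200) and xhat << 1: the two kernels nearly coincide.
kernel_gap = log_beta_kernel(0.01, 0.5, 200.0) - log_gamma_kernel(0.01, 0.5, 200.0)
print(residual, kernel_gap)
```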
We extend this model to the case of feedback where x increases gene expression by either
increasing burst frequency or burst size. To do so, we let $\lambda$ and $\gamma$ depend explicitly on x
using a Hill-like functional form:
(2-3)  $\lambda = \lambda_0\left(\varepsilon_\lambda + \frac{1}{1 + (K_\lambda/\hat{x})^m}\right)$
(2-4)  $\gamma = \gamma_0\left(\varepsilon_\gamma + \frac{1}{1 + (\hat{x}/K_\gamma)^n}\right)$
where $\hat{x} = x/\mu$ is a normalized mRNA/protein number with respect to the maximum
level if the promoter were always in the active configuration. Both $K_\lambda$ and $K_\gamma$ represent a
threshold value for half-maximal activation, $\lambda_0\varepsilon_\lambda$ and $\gamma_0(1+\varepsilon_\gamma)$ represent basal expression,
and m and n are the Hill coefficients. We then solve the modified Fokker-Planck equation
including either Eqn. 2-3 or 2-4 to find the steady-state probability density for the two types
of regulation in the case of feedback.
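(2-5)  $p_{bf}(\hat{x}) = C\,\big(K_\lambda^m + \hat{x}^m\big)^{\lambda_0/m}\,\hat{x}^{\lambda_0\varepsilon_\lambda - 1}\,(1-\hat{x})^{\gamma - 1}$
(2-6)  $p_{bs}(\hat{x}) = C\,\exp\left[-\gamma_0 \int \big(1 + (\hat{x}/K_\gamma)^n\big)^{-1}(1-\hat{x})^{-1}\,d\hat{x}\right]\hat{x}^{\lambda - 1}\,(1-\hat{x})^{\gamma_0\varepsilon_\gamma - 1}$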
Here, C represents a normalization constant. We have previously reported 2-5 in the
Supplemental Text of (To & Maheshri, 2010). The integral in 2-6 can be evaluated in closed
form for n = 1. Our use of burst size and frequency as subscripts is premature, because 2-5
and 2-6 do NOT assume bursty gene expression - they only require that promoter
fluctuations are the sole source of variability.
Making the "bursty" gene expression assumption for 2-5 yields:
(2-7)  $p(\hat{x}) \propto \big(K_\lambda^m + \hat{x}^m\big)^{\lambda_0/m}\,\hat{x}^{\lambda_0\varepsilon_\lambda - 1}\,\exp[-\gamma\hat{x}]$
where $\gamma\hat{x} = x/(\mu/\gamma)$ is x measured in units of the burst size. This
is equivalent to equation 9 in (Friedman, Cai & Xie, 2006). That equation was
derived assuming translational bursts in protein levels, although it was argued that the same
equation applies to transcriptional bursts from promoter fluctuations, which is in agreement
with the results here. Moreover, while in this model the only source of noise is promoter
fluctuations, in the "bursty" limit, the noise from random births and deaths of x is negligible.
Finally, an alternative expression to 2-6 can be derived if x influences the transcription rate
$\mu$ (Section 2.3.3), but both expressions are identical in the "bursty" limit, as $\mu$ and $\gamma$ are
no longer independent and both influence burst size.
We note these results provide an analytical solution to the total distribution in the case
of feedback regulation, not just the moments. Analytical solutions for feedback in genes with
Hill-like functions (Friedman, Cai & Xie, 2006; Karmakar & Bose, 2007) have been limited
to burst frequency regulation. An exact analytical solution for feedback has been reported
with no continuum approximation (Hornos et al., 2005) but it is restricted to a repressor
increasing the promoter deactivation rate in a linear manner.
2.3.2 The regime of bimodal expression
The analytical expressions for the mRNA/protein (x) distribution in feedback provide a
convenient starting point to determine if and when the distribution is bimodal. To derive
the bimodal conditions, the general strategy is to determine the number of extrema in the
distribution by analyzing the number of real roots of the derivative of the distribution.
Importantly, we only have to restrict our attention to the range $\hat{x} \in [0,1]$.
For burst frequency regulation, we start by taking the derivative of 2-5:
(2-8)  $\frac{dp_{bf}}{d\hat{x}} = p_{bf}\left[\frac{\lambda_0\varepsilon_\lambda - 1}{\hat{x}} + \frac{\lambda_0\,\hat{x}^{m-1}}{K_\lambda^m + \hat{x}^m} - \frac{\gamma - 1}{1 - \hat{x}}\right]$
First, it is worth noting that the nature of the distribution at the boundaries $\hat{x} = 0, 1$ is
simply given by the 1st and 3rd terms in the bracketed expression of 2-8. For $\lambda_0\varepsilon_\lambda$ greater than
1 (high basal burst frequency) the slope of the distribution at $\hat{x} = 0$ is positive and the
probability mass is shifted from 0. If $\gamma$ ever falls below 1 (slow promoter fluctuations), then
the slope of the distribution at $\hat{x} = 1$ is positive and the probability mass is shifted to 1. (If
the promoter were always on, the distribution would be a delta function at $\hat{x} = 1$ because
mRNA/protein dynamics are deterministic.)
Under bursty conditions $\gamma \gg 1$ and the third term is always negative. The second
term is always positive. If $\lambda_0\varepsilon_\lambda > 1$, then the first term is also always positive. Therefore the
derivative changes from positive to negative only once, corresponding to a single maximum
of a unimodal distribution. This is true for any Hill coefficient m. Basal bursts are large
and noisy enough to always keep the promoter ON.
When $\lambda_0\varepsilon_\lambda < 1$, the derivative can change from negative to positive, and then back to
negative, corresponding to two maxima and a bimodal distribution. Intuitively, the
transition from negative to positive occurs at a value of $\hat{x}$ where the middle term representing
the activator-dependent burst frequency regulation gets larger in magnitude than the first term, but
the third term is still not large comparatively. To derive conditions under which this occurs,
we set the bracketed term equal to zero and rewrite it as a polynomial in $\hat{x}$:
(2-9)  $(L + G + \lambda_0)\,\hat{x}^{m+1} - (L + \lambda_0)\,\hat{x}^{m} + K_\lambda^m (L + G)\,\hat{x} - K_\lambda^m L = 0$
where $L = \lambda_0\varepsilon_\lambda - 1$ and $G = \gamma - 1$. This was used to generate the plots in Figure 2-2.
Roots for this polynomial can be found numerically for any particular set of parameters. For
better understanding, we explicitly derive conditions for bimodality in two extreme cases,
$m = 1$ and $m \to \infty$.
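The numerical search for extrema can be sketched as follows. This is a minimal illustration, not the procedure used for Figure 2-2: it assumes the log-derivative of the burst-frequency distribution has the three-term form $(\lambda_0\varepsilon_\lambda - 1)/\hat{x} + \lambda_0\hat{x}^{m-1}/(K_\lambda^m + \hat{x}^m) - (\gamma - 1)/(1 - \hat{x})$ and simply counts its sign changes on a grid over (0, 1). The function names, grid size, and parameter values are hypothetical.

```python
def bracket_bf(xhat, lam0, eps, gamma, K, m):
    """Three-term log-derivative of the burst-frequency distribution (assumed form)."""
    return ((lam0 * eps - 1.0) / xhat
            + lam0 * xhat ** (m - 1) / (K ** m + xhat ** m)
            - (gamma - 1.0) / (1.0 - xhat))

def count_sign_changes(fn, n=20000):
    """Count sign changes of fn on the open interval (0, 1) using n - 1 grid points."""
    signs = []
    for i in range(1, n):
        value = fn(i / n)
        if value != 0.0:
            signs.append(value > 0.0)
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Illustrative parameters: lam0 = 5, eps = 0.025, gamma = 50, Hill coefficient m = 1.
strong_feedback = count_sign_changes(lambda x: bracket_bf(x, 5.0, 0.025, 50.0, 0.02, 1))
weak_feedback = count_sign_changes(lambda x: bracket_bf(x, 5.0, 0.025, 50.0, 0.2, 1))
print(strong_feedback, weak_feedback)
```

Two sign changes (negative, positive, negative) indicate a peak at zero plus an interior peak, that is, a bimodal distribution; no sign change indicates a monotonically decreasing, unimodal distribution.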
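With $m = 1$, explicitly assuming bursty expression ($G \gg L + \lambda_0$) simplifies 2-9 to:
(2-10)  $G\,\hat{x}^2 + \big(G K_\lambda - (L + \lambda_0)\big)\,\hat{x} - K_\lambda L = 0$
This quadratic equation can be solved and has two real roots (corresponding to a bimodal
expression profile) when the discriminant is positive. The discriminant is:
(2-11)  $(G K_\lambda)^2 - 2 G K_\lambda(\lambda_0 + 1) + (\lambda_0 - 1)^2 + \varepsilon_\lambda\big(2\lambda_0 G K_\lambda + 2\lambda_0(\lambda_0 - 1)\big) + \varepsilon_\lambda^2\lambda_0^2$
Recognizing that $G K_\lambda$ is the promoter threshold measured relative to the normalized burst size $1/G \approx 1/\gamma$,
(2-12)  $G K_\lambda < \lambda_0(1 - \varepsilon_\lambda) + 1 - 2\sqrt{\lambda_0(1 - \lambda_0\varepsilon_\lambda)}$, or $\frac{1}{K_\lambda} > \frac{G}{\lambda_0(1 - \varepsilon_\lambda) + 1 - 2\sqrt{\lambda_0(1 - \lambda_0\varepsilon_\lambda)}}$
is the condition for bimodal expression (provided $\lambda_0\varepsilon_\lambda < 1$).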
For a given burst frequency,
bimodal expression occurs when the burst size is large enough compared to the promoter
threshold. While 2-12 yields bimodal expression for larger burst sizes, or stronger feedback
(via a smaller effective promoter threshold), the OFF peak at $\hat{x} = 0$ becomes vanishingly
small. Therefore, there is a feedback strength beyond which the response is realistically unimodal
(all ON). We operationally define bimodality as at least 5% of the probability mass in each of the ON and OFF
populations and at least 5 mRNA counts of separation between the ON and OFF peaks.
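One way to apply this operational definition to a discrete mRNA distribution is sketched below. The split of the probability mass at the valley between the lowest and highest peaks is our own assumption about how the ON and OFF populations might be delimited; the function name `is_bimodal` and the example distributions are hypothetical.

```python
def is_bimodal(pmf, min_mass=0.05, min_separation=5):
    """Operational bimodality test for pmf[k] = probability of k mRNA.

    Requires two peaks at least min_separation counts apart, with at least
    min_mass of the probability on each side of the valley between them.
    """
    total = sum(pmf)
    last = len(pmf) - 1
    peaks = [k for k in range(len(pmf))
             if (k == 0 or pmf[k] > pmf[k - 1]) and (k == last or pmf[k] >= pmf[k + 1])]
    if len(peaks) < 2:
        return False
    off_peak, on_peak = peaks[0], peaks[-1]
    # Assumption: the OFF and ON populations are split at the lowest point between the peaks.
    valley = min(range(off_peak, on_peak + 1), key=lambda k: pmf[k])
    off_mass = sum(pmf[:valley]) / total
    on_mass = sum(pmf[valley:]) / total
    return (on_peak - off_peak >= min_separation
            and off_mass >= min_mass and on_mass >= min_mass)

two_peaks = [0.3, 0.1, 0.02, 0.01, 0.02, 0.05, 0.1, 0.2, 0.15, 0.05]
one_peak = [0.5, 0.25, 0.125, 0.0625, 0.0625]
tiny_off_peak = [0.02, 0.005, 0.005, 0.01, 0.05, 0.1, 0.3, 0.31, 0.2]
print(is_bimodal(two_peaks), is_bimodal(one_peak), is_bimodal(tiny_off_peak))
```

The third example has two peaks, but the OFF population holds less than 5% of the mass, so it is classified as realistically unimodal.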
For the case of $m \to \infty$, we return to the bracketed term in 2-8 which can now be
simplified to:
(2-13)  $\frac{L}{\hat{x}} + \frac{\lambda_0\,\theta(\hat{x}/K_\lambda)}{\hat{x}} - \frac{G}{1 - \hat{x}}$
where $\theta(z)$ is the Heaviside step function, which is zero for $z < 1$ and unity for $z \ge 1$. With
$\gamma \gg 1$ and $\lambda_0\varepsilon_\lambda < 1$, the first and third terms are always negative. Hence the derivative can
only have a zero if the middle term is large enough that for some intermediate value of
$\hat{x} \ge K_\lambda$, 2-13 is positive. In this case it must have two zeros because for large enough $\hat{x}$ the
third term will dominate and the derivative will become negative again. Thus the condition
for bimodality is:
(2-14)  $K_\lambda < \frac{L + \lambda_0}{L + G + \lambda_0}$, or $\frac{1}{K_\lambda} > 1 + \frac{G}{L + \lambda_0} = 1 + \frac{\gamma - 1}{\lambda_0(1 + \varepsilon_\lambda) - 1}$
Equation 2-14 is very similar to 2-12. There is a wider range of bimodality because the
radical term in Eq. 2-12 is not present in 2-14; hence for any given burst frequency, the right
hand side is smaller in 2-14. The effect of the first term on the right hand side of Eq. 2-14
is negligible in the bursting limit. While intermediate values of m lead to more complicated
conditions, they tend to increase the range of bimodality from the lower limit of Eq. 2-12 to
the upper limit of 2-14.
For burst size regulation, we begin by taking the derivative of Eq. 2-6:
(2-15)  $\frac{dp_{bs}}{d\hat{x}} = p_{bs}\left[\frac{\lambda - 1}{\hat{x}} - \frac{\gamma_0}{1 + (\hat{x}/K_\gamma)^n}\,\frac{1}{1 - \hat{x}} - \frac{\gamma_0\varepsilon_\gamma - 1}{1 - \hat{x}}\right]$
The first term dominates for small $\hat{x}$ and can be positive or negative depending on $\lambda$
(where for $\lambda < 1$ there will be a peak at $\hat{x} = 0$). The third term is always negative since the
promoter inactivation rate remains fast even at the maximum burst size ($\gamma_0\varepsilon_\gamma - 1$ is always
greater than zero). The middle term is always negative but goes to zero for large $\hat{x}$.
Therefore, for $\lambda < 1$, all 3 terms are negative and the distribution is peaked at $\hat{x} = 0$ and
monotonically decreases.
For $\lambda > 1$, bimodality is possible if the sum of the three terms changes sign from positive
to negative to positive to negative. This occurs when for some $\hat{x} < K_\gamma$ the sum changes from
positive to negative as the middle term gets larger; then for some $\hat{x} > K_\gamma$ the sum changes
back to positive as the middle term gets smaller; and then finally the sum becomes negative
as the last term dominates. As in the case of burst frequency regulation, we can set the
bracketed term to zero and rewrite as a polynomial in $\hat{x}$:
(2-16)  $\big[(1-\lambda) + (1-\gamma_0\varepsilon_\gamma)\big]\hat{x}^{n+1} - (1-\lambda)\,\hat{x}^{n} + K_\gamma^n\big[(1-\lambda) + (1-\gamma_0\varepsilon_\gamma) - \gamma_0\big]\hat{x} - K_\gamma^n(1-\lambda) = 0$
A bimodal distribution is possible only when $1 - \lambda < 0$ and Eq. 2-16 has 3 positive real
roots. For better understanding, we explicitly derive conditions for bimodality in two
extreme cases, $n = 1$ and $n \to \infty$.
With a Hill coefficient of 1, we rewrite Eq. 2-16 for the case of $n = 1$:
(2-17)  $\big[(1-\lambda) + (1-\gamma_0\varepsilon_\gamma)\big]\hat{x}^2 + \big[(K_\gamma - 1)(1-\lambda) + K_\gamma\big((1-\gamma_0\varepsilon_\gamma) - \gamma_0\big)\big]\hat{x} - K_\gamma(1-\lambda) = 0$
Bursty expression implies a short-lived active promoter state, even in the presence of
saturating amounts of transcriptional activator. Therefore, $(1 - \gamma_0\varepsilon_\gamma) < 0$. Combined with
the fact $1 - \lambda < 0$, the quadratic coefficient is always negative and the constant coefficient
is always positive. It is also clear that for $K_\gamma \ge 1$, the linear coefficient is always negative.
For $K_\gamma < 1$, the linear coefficient is also always negative, provided $\gamma_0 \gg \lambda$, a condition of
bursty expression. To see this, the linear coefficient in Eq. 2-17 can be rewritten as
$(K_\gamma - 1)(1 - \lambda) - K_\gamma\gamma_0 + (K_\gamma - K_\gamma\gamma_0\varepsilon_\gamma)$. The first term of this expression is positive but always
strictly less than $\lambda$ and the third term is always negative; therefore the linear coefficient
is always negative. Then, by Descartes' rule of signs, there is only ever one positive real
root under bursty conditions, which implies that there is always a unimodal distribution. In
other words, in contrast to pure burst frequency regulation, pure burst size regulation will
never yield a bimodal response with a Hill coefficient of 1.
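The sign-counting argument can be checked with a short sketch. It assumes the quadratic coefficients obtained by clearing denominators in the log-derivative of the burst-size distribution with n = 1 (our reading of Eq. 2-17) and applies Descartes' rule of signs. The function names and parameter values ($\lambda = 1.25$, $\gamma_0 = 488$, $\varepsilon_\gamma = 0.025$) are illustrative.

```python
def quadratic_coeffs_bs(lam, gamma0, eps, K):
    """Coefficients [a2, a1, a0] of the n = 1 polynomial for burst size regulation."""
    a2 = (1.0 - lam) + (1.0 - gamma0 * eps)
    a1 = (K - 1.0) * (1.0 - lam) + K * ((1.0 - gamma0 * eps) - gamma0)
    a0 = -K * (1.0 - lam)
    return [a2, a1, a0]

def descartes_sign_changes(coeffs):
    """Number of sign changes in the coefficient sequence, ignoring zeros.

    By Descartes' rule of signs this bounds the number of positive real roots,
    and a single sign change implies exactly one positive root.
    """
    signs = [c > 0.0 for c in coeffs if c != 0.0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

thresholds = [0.0001, 0.001, 0.05, 0.5, 0.9]
changes = [descartes_sign_changes(quadratic_coeffs_bs(1.25, 488.0, 0.025, K))
           for K in thresholds]
print(changes)
```

Because the leading coefficient is negative and the constant coefficient is positive whenever $\lambda > 1$ and $\gamma_0\varepsilon_\gamma > 1$, the sequence has exactly one sign change for each threshold tried here.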
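For the case of $n \to \infty$, we return to the bracketed term in 2-15 which can now be
simplified to:
(2-18)  $\frac{\lambda - 1}{\hat{x}} - \frac{\gamma_0\big(1 - \theta(\hat{x}/K_\gamma)\big)}{1 - \hat{x}} + \frac{1 - \gamma_0\varepsilon_\gamma}{1 - \hat{x}}$
$\theta(z)$ is again the Heaviside step function. In the relevant case of $1 - \lambda < 0$
and $(1 - \gamma_0\varepsilon_\gamma) < 0$, the derivative is always positive for $\hat{x}$ close to 0 and negative for $\hat{x}$ close to 1.
The condition for bimodal expression then is that the derivative changes sign from negative
to positive at some intermediate $\hat{x} = K_\gamma$ where the middle term in Eq. 2-18 drops out.
Formally, this translates into the following two conditions:
(2-19)  $\frac{\lambda - 1}{K_\gamma} - \frac{\gamma_0}{1 - K_\gamma} + \frac{1 - \gamma_0\varepsilon_\gamma}{1 - K_\gamma} < 0 \quad\text{and}\quad \frac{\lambda - 1}{K_\gamma} + \frac{1 - \gamma_0\varepsilon_\gamma}{1 - K_\gamma} > 0$
The two conditions can be combined to yield the range of $K_\gamma$ over which bimodal
expression occurs:
(2-20)  $\frac{\lambda - 1}{\lambda - 2 + \gamma_0(1 + \varepsilon_\gamma)} < K_\gamma < \frac{\lambda - 1}{\lambda - 2 + \gamma_0\varepsilon_\gamma}$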
Next we compare the stochastic and deterministic range of bimodality. In the
deterministic case, the differential equation describing positive feedback in terms of the
microscopic rate constants used here is:
(2-21)  $\frac{dx}{dt} = \mu\,\frac{\lambda(x)}{\lambda(x) + \gamma(x)} - x$
where we make explicit that $\lambda$ and $\gamma$ are potentially functions of x depending on
whether burst size or frequency is being regulated. (The denominator can be simplified for
bursty expression since $\gamma \gg \lambda$, but we will treat this general case).
To find the fixed points of 2-21, we set the rhs equal to zero. Rewriting in terms of the
rescaled constants:
(2-22)  $\hat{x} = \frac{\lambda(\hat{x})}{\gamma(\hat{x}) + \lambda(\hat{x})}$
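We want to make an explicit comparison of the range of bimodality in terms of
parameters for the deterministic case versus the stochastic case. For $n, m = 1$, we already
know that Eq. 2-22 only yields a single fixed point and no deterministic bistability is possible.
For $m, n \to \infty$ the range of bistability is easily calculated. For example, for $m \to \infty$, Eq. 2-22
can be rewritten as:
(2-23)  $\hat{x} = \frac{\lambda_0\big(\varepsilon_\lambda + \theta(\hat{x}/K_\lambda)\big)}{\gamma + \lambda_0\big(\varepsilon_\lambda + \theta(\hat{x}/K_\lambda)\big)}$
which is an explicit expression for the fixed points $\hat{x}$ that can be evaluated for $\hat{x}$ greater
or less than $K_\lambda$. Bistability occurs when the calculated fixed point is consistent with it being
greater or less than $K_\lambda$. This leads to the following range of bistability for burst frequency,
Eq. 2-24, and burst size regulation, Eq. 2-25:
(2-24)  $\frac{\lambda_0\varepsilon_\lambda}{\gamma + \lambda_0\varepsilon_\lambda} < K_\lambda < \frac{\lambda_0(1 + \varepsilon_\lambda)}{\gamma + \lambda_0(1 + \varepsilon_\lambda)}$
(2-25)  $\frac{\lambda}{\lambda + \gamma_0(1 + \varepsilon_\gamma)} < K_\gamma < \frac{\lambda}{\lambda + \gamma_0\varepsilon_\gamma}$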
The deterministic range of bistability of equations 2-24 and 2-25 can be directly
compared to the stochastic range of bimodality of equations 2-14 and 2-20. For both burst
size and burst frequency regulation, the upper limit of the stochastic range of bimodality is
less than its deterministic counterpart, but approaches it for large burst frequency and small
burst size. Near this upper limit, the stable steady-states are close (a pitchfork bifurcation
occurs at the upper limit) and so noise in expression can easily promote transitions, erasing
any distinction.
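The comparison of ranges can be made concrete with a small sketch for the limit of infinite Hill coefficients. It assumes the deterministic fixed points satisfy $\hat{x} = \lambda/(\lambda + \gamma)$ on either side of the threshold, and that the stochastic limits follow from the sign of the log-derivative of the distribution at the threshold. Function names and parameter values are hypothetical.

```python
def det_range_bf(lam0, eps, gamma):
    """Deterministic bistable range of K for burst frequency regulation."""
    lower = lam0 * eps / (gamma + lam0 * eps)
    upper = lam0 * (1.0 + eps) / (gamma + lam0 * (1.0 + eps))
    return lower, upper

def stoch_upper_bf(lam0, eps, gamma):
    """Stochastic upper limit of K for burst frequency regulation."""
    b = lam0 * (1.0 + eps) - 1.0
    return b / (gamma - 1.0 + b)

def det_range_bs(lam, gamma0, eps):
    """Deterministic bistable range of K for burst size regulation."""
    return lam / (lam + gamma0 * (1.0 + eps)), lam / (lam + gamma0 * eps)

def stoch_range_bs(lam, gamma0, eps):
    """Stochastic bimodal range of K for burst size regulation."""
    lower = (lam - 1.0) / (lam - 2.0 + gamma0 * (1.0 + eps))
    upper = (lam - 1.0) / (lam - 2.0 + gamma0 * eps)
    return lower, upper

bf_det = det_range_bf(5.0, 0.025, 50.0)
bf_stoch_upper = stoch_upper_bf(5.0, 0.025, 50.0)
bs_det = det_range_bs(1.25, 488.0, 0.025)
bs_stoch = stoch_range_bs(1.25, 488.0, 0.025)
print(bf_det, bf_stoch_upper, bs_det, bs_stoch)
```

For these values both stochastic upper limits fall below their deterministic counterparts, and for burst size regulation the stochastic lower limit also lies below the deterministic one.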
The lower limit of the stochastic range of bimodality is more complicated. For burst
frequency regulation, while there is in principle no lower limit (in the continuum
approximation for protein levels) we have operationally defined it as when less than 5% of
the probability mass remains in the peak corresponding to little or no protein. The result
with this definition leads to a stochastic lower limit that is slightly lower than the
deterministic lower limit. For burst size regulation, comparison of Eq. 2-20 with 2-25 reveals
that for smaller $\lambda$, the lower limit of the stochastic range of bimodality is significantly
lower than the deterministic lower limit. There is a particular $\lambda$ (easily found by equating
the lower limits of Eq. 2-20 and 2-25) at which this relationship flips. However, at this higher
value of $\lambda$, the burst approximation breaks down. With the operational lower limit set by
the vanishing OFF population, the stochastic range is shrunk relative to the deterministic
range, as supported by the numerical solution in Figure 2-2.
2.3.3 Two modes of regulating burst size are equivalent in the bursting limit
In the previous discussion, burst size regulation has been modeled as a change in $\gamma$. In
the burst limit, either $\gamma$ or $\mu$ could be regulated and there should be no effect on expression.
To show this explicitly, consider an alternative to Eq. 2-4 that is similar to 2-3:
(2-26)  $\mu = \mu_0\left(\varepsilon_\mu + \frac{1}{1 + (K_\mu/x)^m}\right)$
For now, we do not normalize x by $\mu$. For illustrative purposes, we focus on the case
of m=1, although similar arguments can be made for arbitrary m. With this regulation, the
ODE in Eq. 2-2 can be solved. After applying the bursty assumption $\gamma \gg \lambda$ and
rearranging:
(2-27)  [expression for $p_{bs}(x)$ not recoverable from the source]
Equation 2-27 is identical to the expression obtained for regulation of $\gamma$, with the following change of variables:
[change of variables not recoverable from the source]
These transformations are due to the reciprocal relationship between the effects of changing $\mu$ and $\gamma$ on
burst size.
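One way to see the equivalence is sketched below. It is a hedged reconstruction rather than the original derivation: it assumes that for m = 1 the regulated synthesis rate is $\mu(x) = \mu_0\big(\varepsilon_\mu + x/(K_\mu + x)\big)$, and that in the bursty limit ($x \ll \mu$) the steady-state density depends on the regulated rates only through the local burst size $b(x) = \mu/\gamma$.

```latex
% Bursty limit: either mode of regulation enters only through 1/b(x) = \gamma/\mu.
p(x) \;\propto\; x^{\lambda - 1}\exp\!\left[-\int \frac{\gamma}{\mu}\,dx\right]

% Regulating \mu with m = 1 (assumed form of the regulated synthesis rate):
\frac{\gamma}{\mu(x)}
  = \frac{\gamma}{\mu_0}\,\frac{K_\mu + x}{\varepsilon_\mu K_\mu + (1+\varepsilon_\mu)\,x}
  = \frac{\gamma}{\mu_0\,\varepsilon_\mu(1+\varepsilon_\mu)}
    \left(\varepsilon_\mu + \frac{1}{1 + x\,(1+\varepsilon_\mu)/(\varepsilon_\mu K_\mu)}\right)
```

Under these assumptions the right-hand side has the same Hill-like form as regulation of $\gamma$ with n = 1, with an effective maximal inactivation rate scaled by $1/\big(\varepsilon_\mu(1+\varepsilon_\mu)\big)$ and an effective threshold $\varepsilon_\mu K_\mu/(1+\varepsilon_\mu)$, which illustrates the reciprocal roles of $\mu$ and $\gamma$ in setting the burst size.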
2.4 Results:
2.4.1 Bimodal expression patterns associated with positive feedback loops are enhanced with burst frequency regulation but reduced with burst size regulation
To compare pure burst size and burst frequency regulation in a mathematically
controlled manner we require an identical dependence of the mean level of gene expression
on a transcriptional regulator for both forms of regulation. Therefore, both the mean basal
level and the fold change in expression are equal for both cases, which constrains other parameters
(Table 2-1). However, the expression distributions are very different, and the earlier described
scaling relationships of noise with abundance are apparent (Figure 2-1). To focus on the
differences between the two types of regulation, we neglect extrinsic noise (plasmid copy
number, global expression capacity, etc.).
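The constraint of equal basal level and fold change can be checked with a short sketch. The numbers are our reading of the values listed in Table 2-1, and both the assignment of values to parameters and the use of the bursty-limit mean (burst frequency times burst size, $\lambda\mu/\gamma$) are assumptions made for illustration.

```python
def bursty_mean(lam, gamma, mu):
    """Mean expression in the bursty limit: burst frequency times burst size."""
    return lam * mu / gamma

MU = 500.0  # synthesis rate in the active state (assumed reading of Table 2-1)

# Burst frequency regulation: lambda varies between lam0*eps and lam0*(1 + eps).
lam0, eps_lam, gamma_fixed = 5.0, 0.025, 50.0
freq_basal = bursty_mean(lam0 * eps_lam, gamma_fixed, MU)
freq_max = bursty_mean(lam0 * (1.0 + eps_lam), gamma_fixed, MU)

# Burst size regulation: gamma varies between gamma0*(1 + eps) and gamma0*eps.
lam_fixed, gamma0, eps_gam = 1.25, 488.0, 0.025
size_basal = bursty_mean(lam_fixed, gamma0 * (1.0 + eps_gam), MU)
size_max = bursty_mean(lam_fixed, gamma0 * eps_gam, MU)

freq_fold = freq_max / freq_basal
size_fold = size_max / size_basal
print(freq_basal, freq_max, freq_fold)
print(size_basal, size_max, size_fold)
```

With these values both modes give a basal mean near 1.25 mRNA, a maximal mean near 50 mRNA, and a 41-fold change.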
Table 2-1: Parameters selected for controlled comparison of burst frequency and size regulation
Regulation | Fold change in expression | Range of mean expression [mRNA] | $\mu$ [mRNA / mRNA lifetime] | $\lambda_0$ | $\varepsilon_\lambda$ | $\gamma_0$ | $\varepsilon_\gamma$
Burst frequency regulation | 41 | [1.25, 50] | 500 | 5 | 0.025 | 50 | -
Burst size regulation | 41 | [1.25, 50] | 500 | 1.25 | - | 488 | 0.025
[Figure 2-1: two panels with x-axes "Input Signal" (left) and "Mean Expression" (right); legend: Frequency regulation, Size regulation.]
Figure 2-1: Despite an identical mean response to an open-loop input signal, frequency and size
regulation have different expression distributions with different noise levels.
Next, we compare the effect of burst size and burst frequency regulation on a gene within
a positive feedback loop, where the Hill coefficient in the promoter dose-response
characteristic is varied (Figure 2-2). We calculate the distributions at different values of
the feedback strength both analytically using Equations 2-5 and 2-6 and numerically using
either a finite Markov approach (Munsky & Khammash, 2006) or the Gillespie algorithm
(Gillespie, 1977). Feedback strength is varied by changing K, the threshold level of
regulator which leads to half-maximal activation. We define the regime of bimodality as the
range of K over which one observes a bimodal distribution with at least 5% of the
distribution lying in each of the ON and OFF states. As has been observed earlier theoretically
(Friedman, Cai & Xie, 2006; Karmakar & Bose, 2007; Samoilov, Plyasunov & Arkin, 2005)
and experimentally (To & Maheshri, 2010), with burst frequency regulation at intermediate
feedback strengths one observes a bimodal distribution even with a Hill coefficient of 1. In
general, burst frequency regulation widens the range of feedback strengths over which
bimodal expression is observed, with the extent of widening diminishing with an increased
Hill coefficient of the autoregulatory response. In contrast, burst size regulation destabilizes
the bistability and decreases the range of feedback strengths over which bimodal expression
is observed. With a Hill coefficient ≤ 1, burst size regulation never results in a bimodal response.
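A minimal sketch of a Gillespie-type simulation of the two-state promoter model is given below. It is an illustration under stated assumptions, not the code used for the figures: time is measured in units of the lifetime of x (degradation rate 1), feedback enters through user-supplied functions of the normalized level $\hat{x} = x/\mu$, and the function names, seed, and parameter values are hypothetical.

```python
import random

def simulate_two_state(lam_fn, gamma_fn, mu, t_end, seed=0):
    """Gillespie simulation of I <-> A, A -> A + x, x -> 0 (degradation rate 1).

    lam_fn and gamma_fn map the normalized level xhat = x / mu to the promoter
    activation and inactivation rates. Returns the time-averaged copy number.
    """
    rng = random.Random(seed)
    t, x, active, area = 0.0, 0, False, 0.0
    while True:
        xhat = x / mu
        rates = (
            0.0 if active else lam_fn(xhat),    # promoter activation
            gamma_fn(xhat) if active else 0.0,  # promoter inactivation
            mu if active else 0.0,              # synthesis of x
            float(x),                           # degradation of x
        )
        total = sum(rates)
        t_next = t + rng.expovariate(total) if total > 0.0 else t_end
        if t_next >= t_end:
            area += x * (t_end - t)
            break
        area += x * (t_next - t)
        t = t_next
        r = rng.random() * total
        if r < rates[0]:
            active = True
        elif r < rates[0] + rates[1]:
            active = False
        elif r < rates[0] + rates[1] + rates[2]:
            x += 1
        elif x > 0:
            x -= 1
    return area / t_end

# Without feedback the time-averaged level should approach mu * lam / (lam + gamma).
mean_x = simulate_two_state(lambda xh: 2.0, lambda xh: 10.0, mu=50.0, t_end=2000.0, seed=1)

# Burst frequency feedback with Hill coefficient 1 (illustrative parameter values).
def lam_feedback(xh, lam0=5.0, eps=0.025, K=0.02):
    return lam0 * (eps + xh / (K + xh))

mean_fb = simulate_two_state(lam_feedback, lambda xh: 50.0, mu=500.0, t_end=200.0, seed=1)
print(mean_x, mean_fb)
```

Recording a histogram of x over time instead of its average would give the expression distribution to which the 5% criterion is applied.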
Why does burst frequency regulation result in bimodal expression over a larger range of
feedback strengths? Without feedback, bursty gene expression results in a gamma
distribution of mRNA or protein at steady-state (Peccoud & Ycart, 1995; Raj et al., 2006;
Shahrezaei, Ollivier & Swain, 2008). If the burst frequency is less than 1, bursts occur less
than once per mRNA or protein lifetime, a significant fraction of cells have no mRNA
or protein, and the distribution is peaked at zero and monotonically decreasing. Burst
frequencies greater than 1 result in a distribution peaked at a non-zero value. When burst
frequency is regulated, a bimodal expression distribution with peaks at zero and at a non-zero
value can be observed. This occurs when the positive feedback loop samples burst frequencies
across a range spanning a burst frequency of 1. Given the basal burst frequency set here is
less than 1, this always happens for large enough feedback strength. In contrast, when burst
size is regulated the burst frequency is set to some fixed quantity while the positive feedback
loop samples various burst sizes. Increasing burst size increases the mean level of expression,
but it does so by widening the distribution rather than shifting it from an OFF to ON peak.
With higher Hill coefficients, the underlying deterministic bistability results in a bimodal
expression profile, but this is destabilized by burst size regulation (Figure 2-2, right).
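The role of a burst frequency of 1 can be illustrated with the mode of the Gamma distribution, assuming shape = burst frequency and scale = burst size. The helper name and the numerical values below are hypothetical.

```python
def gamma_mode(burst_frequency, burst_size):
    """Location of the peak of a Gamma distribution (shape = frequency, scale = size).

    For a shape parameter of 1 or less the density is largest at zero.
    """
    if burst_frequency <= 1.0:
        return 0.0
    return (burst_frequency - 1.0) * burst_size

# Frequency regulation at fixed burst size 10: the peak moves from zero to a non-zero value.
basal_peak = gamma_mode(0.125, 10.0)
induced_peak = gamma_mode(5.125, 10.0)
# Size regulation at a fixed burst frequency of 1.25: the peak is never at zero and scales with burst size.
small_burst_peak = gamma_mode(1.25, 1.0)
large_burst_peak = gamma_mode(1.25, 40.0)
print(basal_peak, induced_peak, small_burst_peak, large_burst_peak)
```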
[Figure 2-2: expression distributions (mRNA per cell) for burst frequency regulation (left) and burst size regulation (right), for noncooperative (n=1, top) and cooperative (n>1, bottom) feedback; center panels show the region of bimodality for stochastic frequency regulation, the deterministic case, and stochastic size regulation.]
Figure 2-2: Qualitative differences in population variability due to positive feedback loops
depending on burst size and burst frequency regulation. Simulation of positive feedback with burst
frequency regulation (left) and burst size regulation (right) shows that burst frequency regulation can
create bimodality in the absence of deterministic bistability whereas burst size regulation cannot
(top). In the presence of deterministic bistability (bottom), burst frequency stabilizes the bistability
and extends the regime of bimodality, whereas burst size regulation destabilizes the bistability and
reduces the bimodal regime. Colored plots are expression distributions at four different expression
levels, modulated via feedback strength, simulated with a finite Markov chain. Center panels are a
summary of the range of feedback strengths for which the system has bimodal expression.
2.4.2 Mode of regulation affects mean expression
These qualitative differences between burst size and frequency regulation affect the mean
expression level. The bimodal expression profile generated with burst frequency regulation
results in a mean expression that is less than the deterministic expectation at intermediate
feedback strengths (Figure 2-3, left, blue). It is the low expression peak that decreases the
mean. The extent of this decrease will increase for lower basal burst frequencies. This is
intimately connected to the fact that lower basal burst frequencies lead to a larger range of
feedback strength over which a bimodal profile is observed. For burst size regulation, the
mean of the unimodal distribution is larger than the deterministic expectation, increasing
the mean expression (Figure 2-3, left, red). Again, the extent of this difference is magnified
with lower burst frequencies and hence higher basal burst sizes. Even at low feedback
strengths, there is enough sampling of higher burst sizes to increase the mean expression.
Sampling of higher burst sizes at low feedback also increases the noise in expression at
intermediate levels (Figure 2-3, right).
[Figure 2-3: two panels with x-axes "Feedback Strength (1/K)" (left) and "Mean Expression" (right); legend: Frequency regulation, Size regulation.]
Figure 2-3: Mean expression in feedback for stochastic burst frequency or size regulation versus
deterministic (red dotted line). (Blue) The zero-peak of bimodal expression with burst frequency
regulation decreases its mean expression relative to deterministic expectation. (Red) Size regulation
samples higher burst sizes at lower feedback strengths, increasing the unimodal population mean
slightly above the deterministic expectation (left) and also population noise (right).
2.5 Discussion
While these results assume a model of coupled transcription and translation, they are
exact when the ratio of the mRNA to protein degradation rates is very high or very low. When
the protein is more stable than mRNA, the burst frequency should normalize the promoter
activation rate by the protein lifetime, whereas the mRNA lifetime is appropriate if the
protein is unstable. The former condition is true for most proteins, and long-lived proteins
can time-average transcriptional bursts resulting in relatively high protein burst frequencies.
However, transcriptional activators and repressors are often unstable (Belle et al., 2006),
which allows genes to turn ON and OFF quickly, and protein lifetimes can be short in
rapidly growing microbes due to dilution.
The results presented here are limited to a two-state promoter model under conditions
where a continuum approximation is appropriate for mRNA. Clearly, promoters have
complex architectures and transitions between multiple states (Hahn, 2004), leading to a
more complex dependence of noise on trans factors and expression level (Sanchez et al.,
2011). Nevertheless, the qualitative effects of burst size and burst frequency regulation
described here should hold and contribute to our understanding of what these additional
complexities add. In reality, genes do not follow the well-parameterized continuous
transcriptional bursting of the popular model, but some genes will behave closer to it than
others, often enough to extract useful insight from the model. The following chapters show
that our gene of interest has cell-cycle dependent active periods rather than bursting. Yet
the bursting framework is still useful for understanding active/inactive dynamics and
predicting consequences for behavior in gene networks.
2.6 References
Acar, M., Becskei, A., & van Oudenaarden, A. (2005). Enhancement of cellular memory
by reducing stochastic transitions. Nature, 435(7039), 228-232.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., & Barkai, N.
(2006). Noise in protein expression scales with natural protein abundance. Nature
Genetics, 38(6), 636-643.
Belle, A., Tanay, A., Bitincka, L., Shamir, R., & O'Shea, E. K. (2006). Quantification of
protein half-lives in the budding yeast proteome. Proceedings of the National
Academy of Sciences, 103(35), 13004-13009.
Blake, W. J., Balázsi, G., Kohanski, M. A., Isaacs, F. J., Murphy, K. F., Kuang, Y.,
Collins, J. J. (2006). Phenotypic consequences of promoter-mediated transcriptional
noise. Molecular Cell, 24(6), 853-865.
Çağatay, T., Turcotte, M., Elowitz, M. B., Garcia-Ojalvo, J., & Süel, G. M. (2009).
Architecture-dependent noise discriminates functionally analogous differentiation
circuits. Cell, 139(3), 512-522.
Cai, L., Friedman, N., & Xie, X. S. (2006). Stochastic protein expression in individual cells
at the single molecule level. Nature, 440(7082), 358-362.
Choi, P. J., Cai, L., Frieda, K., & Xie, X. S. (2008). A stochastic single-molecule event
triggers phenotype switching of a bacterial cell. Science, 322(5900), 442-446.
Chubb, J. R., Trcek, T., Shenoy, S. M., & Singer, R. H. (2006). Transcriptional pulsing of
a developmental gene. Current Biology, 16(10), 1018-1025.
Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic gene
expression in a single cell. Science, 297(5584), 1183-1186.
Friedman, N., Cai, L., & Xie, X. S. (2006). Linking stochastic dynamics to population
distribution: An analytical framework of gene expression. Physical Review Letters,
97(16), 168302.
Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The
Journal of Physical Chemistry, 81(25), 2340-2361.
Golding, I., Paulsson, J., Zawilski, S. M., & Cox, E. C. (2005). Real-time kinetics of gene
activity in individual bacteria. Cell, 123(6), 1025-1036.
Hahn, S. (1998). Activation and the role of reinitiation in the control of transcription by
RNA polymerase II. Cold Spring Harbor Symposia on Quantitative Biology, 63, 181-188.
Hahn, S. (2004). Structure and mechanism of the RNA polymerase II transcription
machinery. Nat Struct Mol Biol, 11(5), 394-403.
Holland, M. J. (2002). Transcript abundance in yeast varies over six orders of magnitude.
Journal of Biological Chemistry, 277(17), 14363-14366.
Hornos, J. E. M., Schultz, D., Innocentini, G. C. P., Wang, J., Walczak, A. M., Onuchic,
J. N., & Wolynes, P. G. (2005). Self-regulating gene: An exact solution. Physical
Review E, 72(5), 051907.
Kaern, M., Elston, T. C., Blake, W. J., & Collins, J. J. (2005). Stochasticity in gene
expression: From theories to phenotypes. Nature Reviews. Genetics, 6(6), 451-464.
Karmakar, R., & Bose, I. (2007). Positive feedback, stochasticity and genetic competence.
Physical Biology, 4(1) 29.
Maamar, H., Raj, A., & Dubnau, D. (2007). Noise in gene expression determines cell fate
in Bacillus subtilis. Science, 317(5837), 526-529.
Maheshri, N., & O'Shea, E. K. (2007). Living with noisy genes: How cells function reliably
with inherent variability in gene expression. Annual Review of Biophysics and
Biomolecular Structure, 36(1), 413-434.
McAdams, H. H., & Arkin, A. (1999). It's a noisy business! Genetic regulation at the
nanomolar scale. Trends in Genetics, 15(2), 65-69.
McAdams, H., & Arkin, A. (1997). Stochastic mechanisms in gene expression. Proceedings
of the National Academy of Sciences, 94(3), 814-819.
Mogno, I., Vallania, F., Mitra, R. D., & Cohen, B. A. (2010). TATA is a modular
component of synthetic promoters. Genome Research, 20(10), 1391-1397.
Munsky, B., & Khammash, M. (2006). The finite state projection algorithm for the
solution of the chemical master equation. The Journal of Chemical Physics, 124(4),
Newman, J. R. S., Ghaemmaghami, S., Ihmels, J., Breslow, D. K., Noble, M., DeRisi, J.
L., & Weissman, J. S. (2006). Single-cell proteomic analysis of S. cerevisiae reveals
the architecture of biological noise. Nature, 441(7095), 840-846.
Paulsson, J., & Ehrenberg, M. (2000). Random signal fluctuations can reduce random
fluctuations in regulated components of chemical regulatory networks. Physical
Review Letters, 84(23), 5447.
Peccoud, J., & Ycart, B. (1995). Markovian modeling of gene-product synthesis.
Theoretical Population Biology, 48(2), 222-234.
Pedraza, J. M., & van Oudenaarden, A. (2005). Noise propagation in gene networks.
Science, 307(5717), 1965-1969.
Raj, A., Peskin, C., Tranchina, D., Vargas, D., & Tyagi, S. (2006). Stochastic mRNA
synthesis in mammalian cells. PLoS Biol, 4(10), e309-e309.
Raj, A., & van Oudenaarden, A. (2009). Single-molecule approaches to stochastic gene
expression. Annual Review of Biophysics, 38(1), 255-270.
Raj, A., & van Oudenaarden, A. (2008). Nature, nurture, or chance: Stochastic gene
expression and its consequences. Cell, 135(2), 216-226.
Raser, J. M., & O'Shea, E. K. (2004). Control of stochasticity in eukaryotic gene
expression. Science, 304(5678), 1811-1814.
Rosenfeld, N., Young, J. W., Alon, U., Swain, P. S., & Elowitz, M. B. (2005). Gene
regulation at the single-cell level. Science, 307(5717), 1962-1965.
Samoilov, M., Plyasunov, S., & Arkin, A. P. (2005). Stochastic amplification and signaling
in enzymatic futile cycles through noise-induced bistability with oscillations.
Proceedings of the National Academy of Sciences of the United States of America,
102(7), 2310-2315.
Sanchez, A., Garcia, H. G., Jones, D., Phillips, R., & Kondev, J. (2011). Effect of
promoter architecture on the cell-to-cell variability in gene expression. PLoS Comput
Biol, 7(3), e1001100.
Shahrezaei, V., Ollivier, J. F., & Swain, P. S. (2008). Colored extrinsic fluctuations and
stochastic gene expression. Mol Syst Biol, 4(198).
So, L., Ghosh, A., Zong, C., Sepulveda, L. A., Segev, R., & Golding, I. (2011). General
properties of transcriptional time series in Escherichia coli. Nature Genetics, 43(6),
554-560.
Suel, G. M., Garcia-Ojalvo, J., Liberman, L. M., & Elowitz, M. B. (2006). An excitable
gene regulatory circuit induces transient cellular differentiation. Nature, 440(7083),
545-550.
Suter, D. M., Molina, N., Gatfield, D., Schneider, K., Schibler, U., & Naef, F. (2011).
Mammalian genes are transcribed with widely different bursting kinetics. Science,
332(6028), 472-474.
Taniguchi, Y., Choi, P. J., Li, G., Chen, H., Babu, M., Hearn, J., Xie, X. S. (2010).
Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in
single cells. Science, 329(5991), 533-538.
Thattai, M., & van Oudenaarden, A. (2001). Intrinsic noise in gene regulatory networks.
Proceedings of the National Academy of Sciences of the United States of America,
98(15), 8614-8619.
To, T., & Maheshri, N. (2010). Noise can induce bimodality in positive transcriptional
feedback loops without bistability. Science, 327(5969), 1142-1145.
Turcotte, M., Garcia-Ojalvo, J., & Süel, G. M. (2008). A genetic timer through
noise-induced stabilization of an unstable state. Proceedings of the National Academy of
Sciences, 105(41), 15732-15737.
Zenklusen, D., Larson, D. R., & Singer, R. H. (2008). Single-RNA counting reveals
alternative modes of gene expression in yeast. Nat Struct Mol Biol, 15(12), 1263-1271.
CHAPTER 3.
The cell-cycle dependence of transcription is a
dominant source of noise in gene expression 1
3.1 Abstract
The large variability in mRNA and protein levels found from both static and temporal
measurements in single cells has been largely attributed to random periods of transcription,
often occurring in bursts. The cell cycle has a pronounced global role in affecting
transcriptional and translational output, but how this influences transcriptional statistics
from noisy promoters is unknown and generally ignored by current stochastic models.
Here we show that variable transcription from the synthetic tetO promoter in S.
cerevisiae is dominated by its dependence on the cell cycle. Real-time measurements of
fluorescent protein at high expression levels indicate tetO promoters increase transcription
rate ~2-fold in S/G2/M, similar to constitutive genes. At low expression levels, where tetO
promoters are thought to generate infrequent bursts of transcription (Raj et al., 2006; To &
Maheshri, 2010), we observe random pulses of expression restricted to S/G2/M, which are
correlated between homologous promoters present in the same cell. The analysis of static,
single-cell mRNA measurements at different points along the cell cycle corroborates these
findings. Our results demonstrate that highly variable mRNA distributions in yeast are not
solely the result of randomly switching between periods of active and inactive gene
expression, but are instead largely driven by differences in transcriptional activity between G1
and S/G2/M.
3.2 Introduction
At the single-cell level, mRNA and protein levels of regulable genes are often found to
be highly variable (Newman et al., 2006; Raj & van Oudenaarden, 2009; Taniguchi et al.,
2010). The resulting long-tailed mRNA and protein distributions are well-described by
stochastic models (Peccoud & Ycart, 1995; Raj et al., 2006; Shahrezaei, Ollivier & Swain,
2008; Taniguchi et al., 2010) of transcriptional bursting, where a promoter undergoes
random and intermittent periods of highly active transcription. Real-time observations of
transcription in multiple organisms appear consistent with this behavior (Choi et al., 2008;
Chubb et al., 2006; Golding et al., 2005; Larson et al., 2011; Maiuri et al., 2011; Muramoto
et al., 2012; Taniguchi et al., 2010; Suter et al., 2011). Thus, both static and temporal views
attribute much of the observed mRNA variability to the stochastic nature of reactions
intrinsic to transcription. Consequently, the standard stochastic model of gene expression
has been widely used to infer steady-state dynamics (Mao et al., 2010; Munsky, Neuert &
van Oudenaarden, 2012; Raj et al., 2006; Tan & van Oudenaarden; To & Maheshri, 2010).

1 Some text and figures are taken from Zopf, Quinn, Zeidman & Maheshri, 2013.
However, earlier studies examining the origin of variability in protein expression found
that such variability is not solely due to stochasticity in reactions intrinsic to gene
expression, but also to extrinsic factors. These studies looked for correlations in expression
between identical copies of one promoter (Elowitz et al., 2002; Raser & O'Shea, 2004; Volfson
et al., 2006) and/or between that promoter and a global or pathway-specific gene (Colman-Lerner
et al., 2005; Pedraza & van Oudenaarden, 2005). Not only is the importance of extrinsic
factors clear, but without time-series measurements the intrinsic noise measured by these
techniques may not be completely ascribed to stochastic reactions in gene expression
(Hilfinger & Paulsson, 2011). While global extrinsic factors have been suggested to largely
impact translation (Raj et al., 2006), their influence on transcription and transcriptional
bursting is unclear. Numerical analysis indicates that an expression distribution well-fit by
the solution to the bursting model does not necessarily mean variability arises from bursting.
The cell-cycle dependence of transcription has gone largely unnoticed but, to some
degree, should be expected. The cell cycle is known to have global effects on total protein
and RNA synthesis that should play a role in transcription (Larson et al., 2011; Trcek et
al., 2011; Volfson et al., 2006). However, with few exceptions (Volfson et al., 2006), most
models of gene regulation do not account for cell cycle variability. Using both static single
molecule mRNA and dynamic real-time protein measurements in single cells, we show that
much of the variability in a synthetic tetO promoter typical of noisy genes in yeast is driven
by differences in transcription rate between G1 and S/G2/M.
3.3 Results
3.3.1 Multiple transcription patterns result in expression distributions
consistent with transcriptional bursting
We measure noise in gene expression from the constitutive DOA1 promoter (PDOA1) and
the regulated tetO promoter with 1 (P1xtetO) or 7 (P7xtetO) activator binding sites. tetO is a
synthetic inducible gene regulation system, originally developed for studying gene regulation
in mammalian cells (Gossen & Bujard, 1992), then transferred to yeast (Gari et al., 1997).
It reversibly expresses a reporter gene in response to its activator tetracycline transactivator
(tTA), which is a fusion of the tetracycline repressor (TetR) from E. coli and the activation
domain VP16 from Herpes Simplex Virus. tTA's binding activity is controlled by derivatives
of the tetracycline antibiotic, such as doxycycline. tTA binds to a specific tetO operator
sequence, one or more copies of which are located upstream of a minimal promoter (CYC1
in this study) (Figure 3-1).
Figure 3-1: Simple depiction of the tetO promoter and activator. N copies of the tTA binding
site, tetO, are inserted upstream of the minimal CYC1 promoter. Dox modulates tTA activity. This
enables controlled expression of a gene of interest (the vYFP fluorescent reporter in this study).
To begin to understand the origins of noise, we first consider the standard stochastic
model of gene expression, which describes promoter fluctuations between two states, ON
and OFF, with exponential waiting times in both states. Conditions of transcriptional
bursting occur when the promoter rarely transitions to the ON state (with some burst
frequency) and spends a short but productive period producing mRNA (with some burst
size). This process is predicted to yield a negative binomial distribution of mRNA at
stationary conditions (Paulsson & Ehrenberg, 2000). Conditions of bursting are expected for
noisy, regulated genes; constitutive genes are expected to be expressed with a "burst size" of
one, equating to Poisson expression statistics. We measure cytoplasmic mRNA distributions
from P1xtetO and P7xtetO without tTA present at basal conditions (Figure 3-2 A, B), with
intermediate levels of tTA (Figure 3-2 C, D) and from PDOA1 at two expression levels titrated
by growth phase (Figure 3-2 E, F). These expression distributions agree well with the
expectations of the bursting model (Figure 3-2, black). tetO expression distributions are
well-fit by the negative binomial distribution and the inferred burst sizes range from 5-8
mRNA for P1xtetO and 9-10 mRNA for P7xtetO across basal and intermediate expression levels.
Expression from PDOA1 leads to Poisson-like distributions, with a burst size between 1 and 2
mRNA (Table 3-1). These burst parameters are consistent with previous measurements (To
& Maheshri, 2010; Zenklusen, Larson & Singer, 2008).
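As a rough illustration of how a burst frequency and burst size can be read off a count distribution, the sketch below applies a method-of-moments estimate to synthetic data. It assumes one common convention (geometric bursts of mean size b arriving a times per mRNA lifetime, so that the mean is a*b and the Fano factor is 1 + b); the exact fitting procedure and convention used for Table 3-1 are not specified here and may differ. The function name and the parameter values are hypothetical and chosen only to resemble a bursty and a Poisson-like case.

```python
import numpy as np

def burst_params_from_moments(counts):
    """Moment-based negative binomial parameters from single-cell mRNA counts.

    Assumed convention (not necessarily the thesis's): geometric bursts of
    mean size b arrive a times per mRNA lifetime, so mean = a*b and
    Fano factor = variance/mean = 1 + b.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    fano = counts.var(ddof=1) / mean
    burst_size = fano - 1.0  # values near 0 indicate Poisson-like data
    burst_freq = mean / burst_size if burst_size > 0 else float("inf")
    return burst_freq, burst_size, fano

rng = np.random.default_rng(0)
a_true, b_true = 0.4, 7.7  # illustrative values only; the data below are synthetic
# numpy's negative_binomial(n, p) has mean n*(1-p)/p, i.e. a*b when p = 1/(1+b)
bursty = rng.negative_binomial(a_true, 1.0 / (1.0 + b_true), size=200_000)
poisson_like = rng.poisson(3.0, size=200_000)

a_hat, b_hat, fano_bursty = burst_params_from_moments(bursty)
_, _, fano_poisson = burst_params_from_moments(poisson_like)
print(f"bursty sample:  a = {a_hat:.2f}, b = {b_hat:.2f}, Fano = {fano_bursty:.2f}")
print(f"Poisson sample: Fano = {fano_poisson:.2f}")
```

A good fit of this kind only says that the first two moments are matched by some (a, b) pair; it does not by itself identify the mechanism that generated the variability.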
Figure 3-2: Cytoplasmic mRNA expression distributions for P1xtetO and P7xtetO without activator
(A, B), P1xtetO and P7xtetO with intermediate levels of activator (C, D) and two levels of PDOA1
expression (E, F). The dot and horizontal lines above the distribution represent the mean and
standard deviation of the distribution. Error bars show the sampling error from bootstrapping. The
expression distributions are well-fit by a negative binomial distribution with moderate burst sizes
for P1xtetO and P7xtetO and a very low burst size of 1-2 mRNA for PDOA1 (black).
But we found signs of extrinsic noise in this expression variability, leading us to doubt
the applicability of the bursting model to explain the transcription dynamics underlying the
expression distributions. The bursting parameters inferred across a wide range of activator
levels indicate a biphasic trend in bursting dynamics at the tetO promoters (data shown
and revisited later in Figure 4-12). But we show with kinetic Monte Carlo (Gillespie, 1977)
simulations that several transcription patterns produce the negative binomial solution of the
two-state promoter model. Some even recreate the observed biphasic bursting dynamics
despite no underlying biphasic origin. Figure 3-3 shows this for: a combination of small
activator-independent basal bursts and large bursts whose frequency increases with activator
(Figure 3-3A); transcriptional bursting with a maximum burst size (rather than an infinite
gamma distribution of burst sizes) (Figure 3-3B); and bursting with an activator-regulated
frequency that is also modulated by the cell cycle (Figure 3-3C). This is an example of a
common error where a model does indeed fit the data, but is not actually a valid
representation of the data's origins.
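For readers unfamiliar with kinetic Monte Carlo, the following is a minimal sketch of a Gillespie simulation of the plain two-state (ON/OFF) promoter model. It is not the simulation code used for Figure 3-3, and the rate constants are hypothetical values chosen to give infrequent, large bursts (time is measured in units of the mRNA lifetime). The time-averaged mean and Fano factor are compared with the standard analytical expressions for this model.

```python
import random

def simulate_two_state(k_on, k_off, k_tx, gamma, t_end, seed=0):
    """Gillespie simulation of a two-state promoter; returns the
    time-weighted mean and Fano factor of the mRNA count."""
    rng = random.Random(seed)
    t, on, m = 0.0, False, 0
    sum_m = sum_m2 = 0.0  # time integrals of m and m^2
    while t < t_end:
        r_switch = k_off if on else k_on   # promoter ON<->OFF switching
        r_make = k_tx if on else 0.0       # transcription only when ON
        r_decay = gamma * m                # first-order mRNA degradation
        total = r_switch + r_make + r_decay
        dt = min(rng.expovariate(total), t_end - t)
        sum_m += m * dt
        sum_m2 += m * m * dt
        t += dt
        if t >= t_end:
            break
        u = rng.random() * total           # pick which reaction fires
        if u < r_decay:
            m -= 1
        elif u < r_decay + r_make:
            m += 1
        else:
            on = not on
    mean = sum_m / t_end
    fano = (sum_m2 / t_end - mean ** 2) / mean
    return mean, fano

# Hypothetical bursty parameters: ~0.4 bursts per lifetime, ~8 mRNA per burst
k_on, k_off, k_tx, gamma = 0.4, 20.0, 160.0, 1.0
sim_mean, sim_fano = simulate_two_state(k_on, k_off, k_tx, gamma, t_end=20_000.0)

# Analytical stationary moments of the two-state model
exact_mean = k_on / (k_on + k_off) * k_tx / gamma
exact_fano = 1.0 + k_tx * k_off / ((k_on + k_off) * (k_on + k_off + gamma))
print(f"simulated mean = {sim_mean:.2f}, analytical mean = {exact_mean:.2f}")
print(f"simulated Fano = {sim_fano:.2f}, analytical Fano = {exact_fano:.2f}")
```

The alternative transcription patterns discussed above (truncated burst sizes, basal plus activated bursts, cell-cycle modulated frequency) would be obtained by changing the propensities inside the loop; they are not implemented in this sketch.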
Figure 3-3: Biologically plausible transcription patterns whose noise recreates the biphasic pattern
we observed experimentally. Top: Schematic of transcription pattern. Bottom: Black: the parameters
used to simulate transcription; blue: the parameters inferred using the negative binomial fit
(axes: burst size versus burst frequency). Hypothetical transcription patterns are: (a) a
combination of small basal bursts and large activated bursts, (b) a maximum sampled burst size of
40, (c) burst frequency regulation where the burst frequency halves for one-quarter of the cell
cycle.
3.3.2 Static mRNA FISH reveals cell-cycle dependent expression may create
extrinsic noise in expression
Furthermore, we measured high positive covariance (ρ = 0.3-0.7) of mRNA expression
from identical genes at homologous loci of a diploid cell (constructed for this purpose). This
was clear evidence of extrinsic noise, prompting further investigation of the source of noise.
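To make the logic of the two-copy comparison concrete, the sketch below computes the correlation between two identical reporters together with the widely used dual-reporter decomposition into intrinsic and extrinsic noise (in the style of Elowitz et al., 2002). The data are synthetic: both copies are drawn as independent Poisson counts whose shared rate depends on a hypothetical two-level "cell-cycle" state. The rates and the fraction of cells in each state are assumptions for illustration, not measured values.

```python
import numpy as np

def dual_reporter_noise(a, b):
    """Correlation and intrinsic/extrinsic noise (squared CVs) for two
    identical reporters measured in the same cells."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    ma, mb = a.mean(), b.mean()
    rho = np.corrcoef(a, b)[0, 1]
    eta_int2 = np.mean((a - b) ** 2) / (2.0 * ma * mb)   # uncorrelated part
    eta_ext2 = (np.mean(a * b) - ma * mb) / (ma * mb)    # shared part
    return rho, eta_int2, eta_ext2

rng = np.random.default_rng(1)
n_cells = 5000
# Hypothetical shared factor: low mean expression in "G1", higher in "S/G2/M"
shared_rate = np.where(rng.random(n_cells) < 0.45, 1.0, 8.0)
copy_a = rng.poisson(shared_rate)   # each copy is Poisson given the shared rate
copy_b = rng.poisson(shared_rate)

rho, eta_int2, eta_ext2 = dual_reporter_noise(copy_a, copy_b)
print(f"rho = {rho:.2f}, intrinsic = {eta_int2:.2f}, extrinsic = {eta_ext2:.2f}")
```

In this toy case each copy is purely Poisson within a state, yet the two copies are strongly correlated because they share the state-dependent rate.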
Figure 3-4 shows a sample mRNA FISH image from three of the conditions above. PDOA1
expression (F) shows the expected low-level, tightly-distributed expression. P1/7xtetO (A, D)
show the expected "noisy" expression, with most cells having few mRNA but some filled
with many mRNA. But closer inspection suggests an interesting trend: the highest-expressing
cells in the bursty regime are all in S/G2/M (colored orange and red here). But no
conclusions can be made without further quantitative analysis.
Figure 3-4: Micrographs of cells from three of the samples with (top) classification of cell-cycle
stage as G1 (yellow), early S/G2 (orange) or later S/G2/M (red) and (bottom) the maximum
projection of eight images of fluorescent rhodamine staining within a Z-stack. We clearly see highly
variable expression from the tetO promoters (B and, less so, A) but tight expression from PDOA1 (C).
The cases of highest expression from P1xtetO and P7xtetO are S/G2/M cells.
To investigate whether mRNA number has cell-cycle dependence, we classify cells by
whether they are in G1 or by the size of their bud in S/G2/M (Figure 3-5). A cell develops a
bud at the G1/S transition, which then grows in size until it reaches approximately 60% of
the cross-sectional area of the mother before budding off as a daughter cell at mitosis (M).
Cell-cycle identification is assisted by staining the nucleus with DAPI, which indicates whether
the nucleus has split into two (Figure 6-1B). We classify G1 and three stages of S/G2/M
based on increasing absolute bud size (Figure 3-5).
Figure 3-5: Cell-cycle stage is classified visually. Budding cells are classed as early, mid or late
S/G2/M according to bud size. This creates a pseudo-temporal cell-cycle profile.
Without any further analysis, direct measurements of mRNA expression in these four
cell-cycle stages clearly show cell-cycle dependent expression (Figure 3-6). PDOA1 expression
increases from G1 through increasing bud size in S/G2/M, but only about 2-fold, based on
the distributions' means. But the change in basal expression from G1 through increasing
bud size is remarkable: mean expression increases several-fold, and the most obvious effect
is on the percentage of cells without mRNA, which drops from 50% to 5% for P1xtetO (A).
Intermediate expression (C, D) appears somewhere between the two. Our observations that
mRNA profiles vary across the cell cycle clearly contradict the current bursting model of
expression. But these expression distributions contain a history of transcription over the
lifetime of the mRNA reporters, making the underlying transcription rate changes
non-obvious. The distribution in each cell-cycle stage also contains noise, which can't be directly
interpreted. For both reasons, we require a model of mRNA production and degradation
across the cell cycle to further interpret the results.
Figure 3-6: Expression distributions of the cases in Figure 3-2 stratified by cell-cycle stage
(rows: no bud, early bud, small bud, large bud; x-axis: mRNA count). Basal expression (A, B)
shows large cell-cycle dependence of expression; constitutive expression (E, F) shows a smaller
degree of cell-cycle dependent expression.
3.3.3 A stochastic model to infer cell-cycle dependent transcription from
mRNA expression distributions
We develop a simple model of cell-cycle transcription to assess what fold-change in
transcription across the cell-cycle is consistent with experimental data. Since transcription
rate is the parameter of interest, whereas mRNA count is observed, we use a model that
incorporates the lifetime of mRNA to link observed mRNA distributions to underlying
dynamics. We modified standard stochastic models for gene expression to incorporate
cell-cycle effects and see if these could describe observed mRNA distributions. The model uses
different but fixed transcription rates in G1 (k_tx/f) and S/G2/M (k_tx):

$$\frac{dM}{dt} = \begin{cases} k_{tx}/f - \gamma_M M, & 0 < t \le t_{G1} \\ k_{tx} - \gamma_M M, & t_{G1} < t \le t_{CC} \end{cases} \qquad (3\text{-}1)$$

with f defined as the ratio of transcription in S/G2/M to G1, such that f = 1 means no cell
cycle effect and f = Inf means no expression in G1. Model parameters set by experimental
observations are: a 20 min vYFP mRNA half-life (ln(2)/γ_M) (To & Maheshri, 2010), a 120
min cell cycle duration (t_CC), and a 55 min G1 duration (t_G1), with the latter two parameter
values varying slightly depending on the sample. We choose to simulate only Poisson
transcription, in order to see how much of the super-Poissonian noise derives from cell-cycle
effects. However, the model can be adapted for bursty transcription, where k_tx is simply the
product of the size and frequency of bursts. We use a finite state Markov approach (Munsky
& Khammash, 2006) to simulate the stochastic birth and death of mRNA across the cell
cycle. This yields the mRNA distribution as a function of cell-cycle progression. The
stationary condition is enforced such that the beginning mRNA distribution is the result of
binomial partitioning of the end mRNA distribution. If one explicitly ignored the cell cycle, this
model is well-known to yield a stationary mRNA distribution that is Poisson with mean
k_tx/γ_M. But our solution is an oscillatory steady-state because the initial mRNA distribution
in new cells matches binomial partitioning of mRNA present in the mother plus bud right
before mitosis. While much is known about the age-dependent structure and size distribution
of yeast populations, we do not attempt to describe these details in this or other models.
These details do not have large qualitative effects and omitting them provides simplicity
without sacrificing our ability to assess the importance of cell-cycle dependent transcription.
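The model described above can be sketched numerically as follows. This is an illustrative reimplementation under stated assumptions, not the code used in this work: the master equation is truncated at a fixed maximum mRNA count and propagated with a matrix exponential (rather than the finite state projection algorithm itself), the partitioning probability at division is assumed to be 0.5, and the transcription rate k_tx = 0.35 per minute is a hypothetical value. The half-life, cell-cycle duration and G1 duration are the values quoted in the text.

```python
from math import comb

import numpy as np
from scipy.linalg import expm

def generator(k_tx, gamma, n):
    """Birth-death generator (columns sum to zero) on mRNA counts 0..n-1."""
    m = np.arange(n)
    A = np.zeros((n, n))
    A[m[1:], m[:-1]] += k_tx            # production: m -> m + 1
    A[m[:-1], m[1:]] += gamma * m[1:]   # degradation: m -> m - 1
    A -= np.diag(A.sum(axis=0))
    return A

def cycle_steady_state(k_tx, f, half_life=20.0, t_cc=120.0, t_g1=55.0,
                       p_keep=0.5, n=80, n_iter=50):
    """Periodic mRNA distributions at birth, end of G1 and end of the cycle.

    p_keep (binomial partitioning probability) and n (truncation) are
    assumptions of this sketch.
    """
    gamma = np.log(2.0) / half_life
    P_g1 = expm(generator(k_tx / f, gamma, n) * t_g1)        # G1 rate k_tx/f
    P_s = expm(generator(k_tx, gamma, n) * (t_cc - t_g1))    # S/G2/M rate k_tx
    B = np.zeros((n, n))                                     # partition at division
    for j in range(n):
        for i in range(j + 1):
            B[i, j] = comb(j, i) * p_keep ** i * (1.0 - p_keep) ** (j - i)
    p_birth = np.zeros(n)
    p_birth[0] = 1.0
    for _ in range(n_iter):              # iterate the cycle map to its fixed point
        p_g1_end = P_g1 @ p_birth
        p_end = P_s @ p_g1_end
        p_birth = B @ p_end
    return p_birth, p_g1_end, p_end

def mean_and_fano(p):
    m = np.arange(p.size)
    mean = p @ m
    return mean, (p @ m ** 2 - mean ** 2) / mean

for f in (1.0, 2.0, np.inf):
    _, p_g1_end, p_end = cycle_steady_state(k_tx=0.35, f=f)
    pooled = 0.5 * (p_g1_end + p_end)    # crude pooling of two time points
    print(f, mean_and_fano(p_g1_end)[0], mean_and_fano(p_end)[0],
          mean_and_fano(pooled)[1])
```

Within this sketch the distribution at any single cell-cycle time remains Poisson, so any Fano factor above one in a pooled distribution comes entirely from mixing cell-cycle stages with different means.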
3.3.4 Regulated transcription at low activator levels is restricted to S/G2;
Constitutive expression varies with gene dosage
To review our results so far, while the overall mRNA distributions exhibit excellent fits
to a negative binomial distribution predicted by the standard model (Paulsson & Ehrenberg,
2000; Raj et al., 2006) (Figure 3-7, grey), partitioning data by cell-cycle phase clearly shows
it is incorrect. We develop a model of constant Poisson transcription with a rate that
increases by f-fold between G1 and S/G2/M. We next apply it to data to estimate the
fold-change in expression and the extent to which this causes overall expression variability.
The model has two free parameters, k_tx and f, which dictate the magnitude of
transcription in G1 and S/G2/M. We evaluated 3 different choices: setting f = 1 such that
k_tx is constant throughout the cell cycle, setting f = 2 consistent with the expected
increase due to gene dosage, and allowing f > 2. With the first two choices, f is set and k_tx
is specified such that the mean of the measured and model distributions are equivalent. For
the last choice, f and k_tx are specified such that the mean of the measured and model G1
and S/G2/M distributions match. Because each choice corresponds to a different way of
modeling cell-cycle dependent transcription, we will refer to these as separate models. We
evaluated the relative performance of each model by qualitative agreement of experimental
distributions over the cell cycle (Figure 3-7) and in S/G2 specifically (Figure 3-9); and
quantitatively, by evaluating a χ² goodness of fit (Table 3-1) and comparing the ratio of the
mean mRNA at the end of the cell cycle to the mean mRNA number in G1, defined as:

$$R_M = \frac{\langle M(t_{CC}) \rangle}{\langle M(t) \rangle_{0 < t \le t_{G1}}} \qquad (3\text{-}2)$$

This is estimated experimentally (R_M) as the ratio of the mean mRNA in cells with the
largest bud size (late G2/M) to cells in G1, and hence will be biased downward.
(Experimental R_M values are listed in Table 3-1.) This is a slightly different way of
examining the models because it ignores the early/mid S/G2 data, but it informs directly
on whether a particular model is capable of describing the relative (S/G2/M versus G1)
increase in mRNA number across the cell cycle. Figure 3-7 shows the cell-cycle-stage
expression distributions overlaid by the best-fit case for f = 2 and f > 2.
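A χ² goodness-of-fit comparison between measured counts and a model distribution can be sketched as below. This is a generic illustration rather than the exact procedure behind Table 3-1: the pooling rule (adjacent count bins are merged until each has an expected count of at least 5), the function name `chi2_gof` and the synthetic data are all assumptions of this sketch.

```python
import numpy as np
from scipy.stats import chisquare, poisson

def chi2_gof(counts, model_pmf, min_expected=5.0, n_fitted=0):
    """Chi-square goodness of fit of integer counts against a model pmf
    defined on 0..len(model_pmf)-1. Sparse bins are pooled (assumed rule)."""
    counts = np.asarray(counts, dtype=int)
    pmf = np.asarray(model_pmf, dtype=float).copy()
    top = pmf.size - 1
    pmf[top] += max(0.0, 1.0 - pmf.sum())     # fold the model's upper tail into the last state
    observed = np.bincount(np.minimum(counts, top), minlength=pmf.size)
    expected = pmf * counts.size
    obs_bins, exp_bins = [0.0], [0.0]
    for o, e in zip(observed, expected):
        obs_bins[-1] += o
        exp_bins[-1] += e
        if exp_bins[-1] >= min_expected:      # bin is full: start a new one
            obs_bins.append(0.0)
            exp_bins.append(0.0)
    # merge the trailing (possibly under-filled) bin into its neighbour
    o_last, e_last = obs_bins.pop(), exp_bins.pop()
    if exp_bins:
        obs_bins[-1] += o_last
        exp_bins[-1] += e_last
    else:
        obs_bins.append(o_last)
        exp_bins.append(e_last)
    # n_fitted = number of model parameters estimated from the same data
    return chisquare(obs_bins, exp_bins, ddof=n_fitted)

rng = np.random.default_rng(2)
states = np.arange(40)
poisson_data = rng.poisson(3.0, size=500)                      # synthetic "Poisson-like" sample
bursty_data = rng.negative_binomial(0.4, 1.0 / 8.7, size=500)  # synthetic "bursty" sample

p_match = chi2_gof(poisson_data, poisson.pmf(states, 3.0)).pvalue
p_wrong_mean = chi2_gof(poisson_data, poisson.pmf(states, 6.0)).pvalue
p_bursty = chi2_gof(bursty_data, poisson.pmf(states, bursty_data.mean()), n_fitted=1).pvalue
print(p_match, p_wrong_mean, p_bursty)
```

With a cutoff such as p = 0.05, a model "passes" when the test fails to reject it; a pass therefore indicates compatibility with the data rather than proof that the model is correct.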
Figure 3-7: Large differences in transcriptional activity between S/G2/M and G1 depend on
promoter. (A) YFP mRNA distributions in a haploid yeast with integrated P1xtetO-YFP and no tTA
are shown in a column as a function of cell-cycle phase. Horizontal lines above each distribution are
the experimental (green) and predicted mean ± standard deviation for different models, calculated by
assuming each bud phase represents 1/3 of S/G2/M. (B) As in (A) but for P7xtetO. (C&D) As in
(A&B) but with tTA and 100 or 500 ng/mL dox added for P1xtetO and P7xtetO, respectively. (E)
Integrated PDOA1-YFP with native DOA1 expressed from a plasmid. Mid log-phase cells analyzed. (F)
As in (E) but late log-phase cells. (Rows: total population, no bud, early bud, small bud, large
bud; x-axes: mRNA count and cell cycle progression. Legend: experimental data with distribution
mean +/- 1 SD; transcriptional bursting; Poisson transcription, f = 2; Poisson transcription, f > 2.)
Not surprisingly, the f = 1 model under-predicts the difference between mRNA levels in
G1 and S/G2/M (see Figure 3-9, black) for all sets of experimental data. This is also reflected
in the quantitative metrics, with a model prediction of R_M = 1.2, where experimental
estimates of R_M are much larger than 1.2 (Table 3-1). But when f = 2, as expected based
on differences in gene dosage, the model qualitatively describes the progression of the
observed distributions for PDOA1 expression (Figure 3-7E&F, Table 3-1). This leads to the
conclusion that transcription from a constitutive gene varies across the cell cycle with gene
dosage. This result should be expected based on simple molecular biology, but has been
underappreciated as a source of extrinsic noise in gene expression, with a couple of exceptions
(Huh & Paulsson, 2011; Volfson et al., 2006). This becomes the null hypothesis for the
cell-cycle dependence of transcription.
But tetO promoter measurements are not consistent with this null hypothesis, and are instead
better described by f > 2, with f > 100 for basal expression (Figure 3-7A-D). Such a large
value of f is consistent with no G1 transcription at all, suggesting that transcription at basal
levels is restricted to S/G2/M. This is supported by experimental R_M values of 9.3 and 3.1
for basal P1xtetO and P7xtetO respectively. This corresponds to f values of Infinite fold-change
and >3 fold-change in expression (Table 3-1). But it also shows the lack of precision in these
inferences: an R_M of 6 should be the maximum, corresponding to no expression in G1. Yet
we observe higher than this for basal P1xtetO expression, and the lower value of 3.1 for P7xtetO
expression may actually be due to experimental error in the low direction. Expression at
intermediate levels seems consistent with some G1 expression, but with fold-changes greater than
explained by gene dosage. P1xtetO and P7xtetO expression is best fit by f = 4 and f = 9,
respectively.
Expression distributions that differ across cell-cycle stages will of course explain some of
the variability in a growing population's expression distribution. But to assess the extent to
which cell-cycle driven changes in transcription level explain noise, we again need more
quantitative analysis. At the DOA1 promoter, all expression noise is explained by Poisson
expression with a 2-fold change between G1 and S/G2/M. This is evident in the agreement
between the horizontal bars in Figure 3-7E&F, and the high χ² p-values in Table 3-1. But the
tetO promoters have somewhat more variable expression than generated by this model of
Poisson expression with f-fold change in expression, especially for P7xtetO. We incorporate an
extra potential source of variability into the model, by randomizing the timing of the
transcription rate transition to occur in a uniformly distributed 40 minute window starting
at the beginning of S/G2/M. (This is supported by the real-time protein measurements
reported in the next section.) This predicts distributions that agree better with observations
for tetO, but not for PDOA1 (Figure 3-8). For both basal and intermediate expression levels
from P1xtetO and intermediate expression from P7xtetO, the model passes the χ² goodness of fit
test, indicating most of the variability is explained by this model (Table 3-1). These results
in no way exclude the possibility of other sources of variability; for example, adding
transcriptional bursting during S/G2/M can also describe the variability, including the
increased variability in P7xtetO expression. The mRNA FISH images for tetO promoters tend
to have bright spots thought to represent nascent mRNA transcription that are more likely
in S/G2/M (Figure 3-4) and may indicate this "bursty" expression is a source of variability.
But it is nonetheless surprising, given former impressions of P1xtetO as a "noisy" gene, that all
expression variability is described by cell-cycle changes in transcription with only slightly
super-Poissonian underlying transcription dynamics.
To summarize all of the models, we show each model's best fit to total S/G2/M
expression in Figure 3-9. It shows the expression distributions are well-captured by: f = 2
with Poisson transcription for PDOA1, f ~ 10 for intermediate tetO expression and f = Infinite
for basal tetO expression, suggesting no G1 transcription at basal levels. All PDOA1 and most
P1/7xtetO variability is explained by the change in transcription rate from G1 to S/G2/M. But
the remaining P1/7xtetO variability could be explained by variability in the timing of
transitioning to S/G2/M transcription levels.
These are strong statements that contradict the current understanding of noise in
transcription. While well-supported by mRNA measurements, this data is static and all
conclusions about temporal cell-cycle changes are inferred from morphology. These conclusions
would therefore be well-supported by a technique that collects data following cell behavior in
real time.
Figure 3-8: mRNA distributions from tetO promoters are better fit by introducing variable timing
in the transition from G1 to increased S/G2/M transcription rates. (A-F) As in Figure 3-7, but with
the model modified to incorporate a random, uniformly distributed transition from G1 to S/G2/M
transcription rates occurring during a 40 min window after budding. (Rows: total population, no
bud, early bud, small bud, large bud; x-axis: mRNA count. Legend: Poisson transcription, f > 2,
variable timing of transition from G1 to S/G2/M transcription rate.)
Figure 3-9: Summary of each model's fit to S/G2/M-specific mRNA distributions. (A-F) Strains
as in Figure 3-7, but comprising a specific comparison of the aggregate S/G2/M distribution and its
mean and standard deviation. Green bars and horizontal lines represent the experimental S/G2/M
distribution and its mean and standard deviation. (x-axis: mRNA count.)
Table 3-1: Comparing experimental mRNA expression distributions to simple models
A
B
C
D
tetO promoter without tetO promoter with
activator (basal)
tTA activator
1xtetO**
7xtetO*** I1xtetO
7xtetO
E
F
Constitutive DOA
promoter
Higher
Lower
Ratio of mean mRNA at late G2 M to G1 as a measure of fold-change in transcription:
R (measured)
9.3
3.1
2.8
2.7
2.4
4.2
corresponding
Inf
4.3
3.6
3.3
2.7
8.9
Negative binomial fit to total distribution:
The standard model equates a frequency and size of bursting to parameters of negative binomial
fit to total population's stationary distribution.
"Burst frequency"
0.4
0.5
2.2
1.2
4.1
1.1
"Burst size"
7.7
8.5
5.8
10.0
1.4
1.3
2-fold transcription increase: *
Poisson transcription with S/G2/M transcription rate increased 2-fold over G1 transcription rate.
The transcription rate is fit to the experimental total mean mRNA count.
f
2
2
2
2
2
2
2
X fit p-value, Total
0
0
0
0
0.95
0.96
2
X fit p-value, G1
0
0
0
0
0.02
0.002
X2 fit p-value, G2
0
0
0
0.001
0.001
0.001
f-fold transcription increase: *
Poisson transcription with S/G2/M transcription rate increased ffold over G1 transcription rate.
The transcription rate and fare fit to the experimental G1 and G2 mean mRNA count.
f
Inf
100
4
9
4
12
2
X fit p-value, Total
0.06
0.002
0.14
0.07
0.004
0.12
X2 fit p-value, GI
2
X fit p-value, G2
0.69
0.17
0.46
0.84
0
0.002
0.004
0
0.18
0.007
0.78
0.69
f-fold transcription increase with variable timing of transition: *
Poisson transcription with S/G2/M transcription rate increased f-fold over GI transcription rate
with a 40 minute uniformly distributed window of switching to S/G2/M transcription rates after
S/G1. Transcription rate and fare fit to the experimental G1 and G2 mean mRNA count.
Inf
100
6
13
7
18
X2 fit p-value, Total
2
X fit p-value, GI
2
X fit p-value, G2 *
0.85
0.003
0.13
0.15
0
0
0.35
0.06
0.30
0.73
0
0
0.01
0
0.84
0.76
0.06
0.27
* Result of a χ² goodness of fit test of the measured data against the model prediction. Cases that
pass or fail are in black or gray text, respectively, with p = 0.05 representing the cutoff.
** For basal expression from P1xtetO, the G1 and S/G2/M expression differ more than expected
for no transcription in G1. Thus the data is best fit where f = infinity, representing no G1
transcription.
*** For basal expression from P7xtetO, because of high expression noise and the fact that 10% of
cells have not turned on by late G2/M, specifying f by matching G1 and S/G2/M means gives a
value of f (~7) that fits distributions poorly. The χ² goodness of fit was maximal and fairly constant
over a range of approximately 50 < f < 100. Thus, f = 100 was selected. For the model with variable
timing of transitioning to S/G2/M transcription rates, the estimated 40 minute window was extended
to the full duration of S/G2/M.
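The χ² goodness-of-fit comparison named in the first footnote can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the procedure used in this work: pooling sparse bins until the expected count reaches 5, subtracting one degree of freedom per fitted parameter, and the function names `chi2_sf`, `poisson_bin_probs` and `chi2_goodness_of_fit` are all illustrative choices, and the example counts are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def poisson_bin_probs(mean, n_bins):
    """P(k) for k = 0..n_bins-2, plus a final bin holding the upper tail."""
    probs = [math.exp(-mean) * mean ** k / math.factorial(k)
             for k in range(n_bins - 1)]
    return probs + [max(0.0, 1.0 - sum(probs))]

def chi2_goodness_of_fit(observed, probs, n_fitted=0, min_expected=5.0):
    """Pearson chi-square of observed cell counts against model bin probabilities.

    Adjacent bins are pooled until each pooled bin has an expected count of at
    least min_expected (an assumed convention, not taken from the thesis).
    """
    n = sum(observed)
    pooled, o_acc, e_acc = [], 0.0, 0.0
    for o, p in zip(observed, probs):
        o_acc += o
        e_acc += n * p
        if e_acc >= min_expected:
            pooled.append((o_acc, e_acc))
            o_acc, e_acc = 0.0, 0.0
    if o_acc > 0 or e_acc > 0:                # fold any sparse remainder into the last bin
        if not pooled:
            raise ValueError("too few cells to form a single bin")
        last_o, last_e = pooled.pop()
        pooled.append((last_o + o_acc, last_e + e_acc))
    df = len(pooled) - 1 - n_fitted
    if df < 1:
        raise ValueError("not enough bins for the number of fitted parameters")
    stat = sum((o - e) ** 2 / e for o, e in pooled)
    return stat, df, chi2_sf(stat, df)

# Hypothetical histograms of 100 cells with 0, 1, 2, or >= 3 mRNA,
# compared with a Poisson model whose mean (1.0) was fit to the data.
model = poisson_bin_probs(1.0, 4)
print(chi2_goodness_of_fit([37, 37, 18, 8], model, n_fitted=1))
print(chi2_goodness_of_fit([70, 10, 10, 10], model, n_fitted=1))
```

With the p = 0.05 cutoff from the footnote, the first hypothetical histogram would pass and the second would fail.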
3.3.5 Real-time fluctuations in protein levels corroborate mRNA
measurements and reveal globally correlated activation
A real-time method for inferring transcription is a strong complement to our mRNA
measurements, which have high mRNA-count resolution but no temporal information. In
our lab, CJ Zopf developed a platform for inferring transcription by tracking fluctuations in
protein levels in single cells growing in microfluidic chambers. The method has a time
resolution of approximately 15 minutes. (See Zopf et al. (2013) main and supporting text
for details of the method.) Transcription rate was inferred from a diploid strain expressing
fluorescent reporters from two copies of P7xtetO at homologous loci and a control constitutive,
highly-expressed PGK1 promoter (Figure 3-10A). For constitutive expression and expression
from a regulated gene at high levels, transcription rate increased approximately two-fold
from G1 to S/G2. This was robust across several different growth conditions, shown here
for growth in glucose (Figure 3-10B, 85 min cell cycle) and raffinose (Figure 3-10C, 210 min
cell cycle). This is in strong agreement with our inferences from mRNA expression.
Figure 3-10: For the constitutive gene and the regulated genes at high expression levels (A),
transcription rate increased approximately two-fold in S/G2/M versus G1 in two growth conditions
(B, C). Dots are the average of expression at each time point of N cells. (This data and the figure
itself were produced by CJ Zopf.)
To then study repressed transcription in real-time, we added 50 ng/mL dox to reduce
P7xtetO expression in the 3-color diploid to levels where transcription is thought to occur in
infrequent, independent bursts at each locus that should be resolvable by the real-time
analysis. But instead of bursts, single-cell traces of transcription rate show occasional "ON"
periods that are restricted to S/G2, generally beginning within 20 minutes of bud formation,
and lasting until division (Figure 3-11A). This also agrees strongly with our measurements
at basal tetO promoters. It offers further information that, rather than just an average
fold-change in expression, expression restricted to S/G2/M is also probabilistic. Our data
suggest that all cells do have some expression in S/G2/M, even under basal conditions,
because the zero-peak of the expression distribution drops almost to zero by the end of
S/G2/M. But the resolution of the protein method for detecting low levels of mRNA
transcription is unknown. So both methods are consistent with a model where cells may
switch to strongly-transcribing states in S/G2/M and otherwise transcribe small amounts of
mRNA.
Figure 3-11. Transcriptional bursts from homologous loci are cell-cycle dependent and partially
correlated. The 3-color diploid strain was grown in microfluidics with 50 ng/mL dox, reducing
expression. (A) The probability that each 7xtetO promoter's transcription rate is above background,
computed by averaging individual cell responses at different cell-cycle progression, increases after G1.
(B) A 2D histogram of activation time for each promoter when both activate (t = 0 at budding).
Most activation occurs near budding and is correlated. (C) Classifying single-cell S/G2/M periods
from (A) by whether each P7xtetO activates reveals correlations in sporadic expression. Error bars
represent SEM from bootstrapping. (This data and the figure itself were produced by CJ Zopf.)
The diploid cells studied with the protein method offer further information about
correlation of transcription in a given cell. The "on" periods are not independent (p < 10- ,
χ² test; 0.42) at each locus (Figure 3-11C). And if both P7xtetO copies turn on, > 70%
of the time they do so within 15 minutes of each other (Figure 3-11B). These results
corroborate analysis of mRNA expression, and are in striking contrast to the view of
transcriptional bursting as intrinsically driven with exponential interarrival times (Golding
et al., 2005; Larson et al., 2011; Raj et al., 2006; Raj & van Oudenaarden, 2009).
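A test of whether "on" periods at the two loci are independent can be illustrated with a 2x2 contingency table. The sketch below is an assumption-laden illustration, not the analysis performed for Figure 3-11C: it uses the Pearson χ² statistic without continuity correction, reports the phi coefficient as the measure of association, and the cell counts and the function name `independence_2x2` are hypothetical.

```python
import math

def independence_2x2(both, cfp_only, yfp_only, neither):
    """Pearson chi-square test of independence for a 2x2 table of S/G2/M periods.

    Returns (chi2, phi, p). For a 2x2 table, chi2 = n * phi**2 with 1 degree of
    freedom, and the upper-tail probability is erfc(sqrt(chi2 / 2)).
    """
    n = both + cfp_only + yfp_only + neither
    cfp_on, cfp_off = both + cfp_only, yfp_only + neither
    yfp_on, yfp_off = both + yfp_only, cfp_only + neither
    phi = (both * neither - cfp_only * yfp_only) / math.sqrt(
        cfp_on * cfp_off * yfp_on * yfp_off)
    chi2 = n * phi ** 2
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, phi, p

# Hypothetical counts of S/G2/M periods, classified by which reporter turned on.
chi2, phi, p = independence_2x2(both=40, cfp_only=20, yfp_only=20, neither=120)
print(f"chi2 = {chi2:.1f}, phi = {phi:.2f}, p = {p:.2e}")
```

A phi coefficient near zero would indicate that each promoter copy activates independently of the other; a positive value indicates that the copies tend to turn on in the same cell-cycle period.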
While increased protein production in S/G2 may be due to increases in translational
capacity, this is unlikely for three reasons. First, while ribosome numbers and activity are
known to increase in yeast in S/G2 (Elliott & McLaughlin, 1978; Waldron, Jund & Lacroute,
1977), ribosome number is generally not considered rate-limiting for any particular gene, as
increasing gene dosage or mRNA number by transcriptional regulation leads to increased
gene expression. Second, recent work in budding (Trcek et al., 2011) and fission (Zhurinsky
et al., 2010) yeast suggests mRNA levels of constitutive genes increase during S/G2. Third,
we find average protein to mRNA ratios of cells grouped by cell-cycle phase to show no
discernible cell-cycle dependent trend (data not shown).
3.4 Discussion:
3.4.1 Implications for understanding stochastic gene expression
Our results indicate the G1 to S/G2 transition has strong effects on transcriptional
activity beyond differences in gene dosage for the tetO promoters, which have characteristics
(strong TATA box, regulable) of "noisy" promoters identified in genome-wide studies
(Bar-Even et al., 2006; Newman et al., 2006). Temporary disruption of a repressed promoter's
chromatin architecture during DNA replication could explain the pulse timing in early S/G2.
Whatever the event, it does not occur independently at homologous loci. Our data alter
the interpretation of studies where static mRNA/protein distributions are fit to stochastic
models of gene expression to infer steady-state dynamics (Mao et al., 2010; Munsky, Neuert
& van Oudenaarden, 2012; Raj et al., 2006; To & Maheshri, 2010). This difficulty of using
static data to pinpoint origins of variability has been anticipated (Hilfinger & Paulsson,
2011; Taniguchi et al., 2010), although even static mRNA FISH data can reveal additional
dynamic information (Wyart, Botstein & Wingreen, 2010), including by disaggregating mRNA
distributions by cell-cycle stage. New models incorporating cell-cycle linked pulses of
transcription should alter predictions of gene network behavior. These models will benefit
from further characterization of transcription dynamics across and within cell-cycle phases,
with greater resolution than afforded by the techniques used here.
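The way a cell-cycle difference in transcription can mimic "bursty" statistics in a pooled population can be illustrated with a toy simulation. The sketch below is a simplification under stated assumptions: each cell is taken to be at its own Poisson steady state within its phase (mRNA carry-over between phases and gene dosage are ignored), and the G1 mean, fold increase, G1 fraction and function names are hypothetical. By the law of total variance, a mixture of two Poisson populations with means m1 and m2 and G1 fraction p has Fano factor 1 + p(1 - p)(m1 - m2)²/mean, which exceeds 1 whenever the phase means differ.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_population(n_cells, mean_g1, fold, frac_g1, seed=0):
    """mRNA counts for a snapshot of cells in G1 or S/G2/M (assumed toy model)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_cells):
        in_g1 = rng.random() < frac_g1
        counts.append(sample_poisson(mean_g1 if in_g1 else fold * mean_g1, rng))
    return counts

def fano_factor(counts):
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean

def mixture_fano(mean_g1, fold, frac_g1):
    """Analytic Fano factor of the two-phase Poisson mixture."""
    m1, m2 = mean_g1, fold * mean_g1
    mean = frac_g1 * m1 + (1.0 - frac_g1) * m2
    return 1.0 + frac_g1 * (1.0 - frac_g1) * (m1 - m2) ** 2 / mean

counts = simulate_population(20000, mean_g1=1.0, fold=6.0, frac_g1=0.5)
print("simulated Fano factor:", round(fano_factor(counts), 2))
print("analytic Fano factor: ", round(mixture_fano(1.0, 6.0, 0.5), 2))
```

Although every cell in this toy population transcribes as a Poisson process, the pooled distribution is super-Poissonian, so a bursting model fit to the pooled snapshot would report a burst size greater than one.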
3.4.2 Gene activation kinetics are also cell-cycle dependent
Of further interest is whether cell-cycle also drives the kinetics of a gene's response to a
changing environment. CJ Zopf used the real-time protein tracking platform to investigate
the cell-cycle dependence of gene activation kinetics (Zopf et al., 2013). He measured the
time to activate P1xtetO and P7xtetO in response to a step change in transcription factor (TF)
input. When the signal arrived during early S/G2, activation mostly occurred during S/G2.
When the signal arrived during G1, activation was often delayed until S/G2. When the
signal arrived during late G2/M, activation was sometimes delayed until the following S/G2,
almost an entire cell-cycle later. Thus activation is cell-cycle dependent and enriched in
S/G2. This suggests that stationary dynamics can inform about kinetic behavior, and that
cell-cycle plays a role in the kinetics of a gene network's response to changes in signaling.
3.4.3 A hypothesis that chromatin maturation permits repressed
transcription
The notion that nascent chromatin may permit transcription from repressed genes
following DNA replication is longstanding, and suggests a most interesting source of
cell-cycle dependent transcription. Wolffe (1991) suggested that the open chromatin
structure of newly-replicated DNA might allow for formation of an active transcription
complex; Guptasarma (1995) hypothesized that even in E. coli, the unwrapping of DNA
during replication may allow for transcription of repressed genes. Experimental studies
followed, with a demonstration that nascent chromatin provides a transient period in which
basal transcription can occur (Almouzni & Wolffe, 1993) and in which an activator could
bind (Kamakaka, Bulger, & Kadonaga, 1993). Cesari et al. (1998) used cycloheximide to
uncouple DNA replication and chromatin assembly, which induced S-phase transcriptional
activation. Later examples show that replication can disrupt epigenetic states to allow
transcription, e.g. from heterochromatic repeats (Chen et al., 2008). Of broader interest,
activation of transcription is linked to DNA replication in several cases in development (e.g.
Fisher & Mechali, 2003). It may also have a role in disease, such as derepression of an
oncogene. Crowe et al. (2000) showed that unscheduled, accelerated replication contributes
to chromatin derepression by diluting repressing factors and enhancing activator
accessibility. This evidence of links between chromatin remodeling and repressed
transcription informs our hypothesis that post-replication chromatin maturation creates the
S/G2 window for transcriptional activation, explained in Section 5.2.
3.5 References
Almouzni, G., & Wolffe, A. P. (1993). Replication-coupled chromatin assembly is required
for the repression of basal transcription in vivo. Genes & Development, 7(10), 2033-2047.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., & Barkai, N.
(2006). Noise in protein expression scales with natural protein abundance. Nature
Genetics, 38(6), 636-643.
Cesari, M., Heliot, L., Meplan, C., Pabion, M., & Khochbin, S. (1998). S-phase-dependent
action of cycloheximide in relieving chromatin-mediated general transcriptional
repression. Biochemical Journal, 336(3), 619-624.
Charvin, G., Cross, F. R., & Siggia, E. D. (2008). A microfluidic device for temporally
controlled gene expression and long-term fluorescent imaging in unperturbed dividing
yeast cells. PLoS One, 3(1), e1468.
Chen, E. S., Zhang, K., Nicolas, E., Cam, H. P., Zofall, M., & Grewal, S. I. S. (2008). Cell
cycle control of centromeric repeat transcription and heterochromatin assembly.
Nature, 451(7179), 734-737.
Choi, P. J., Cai, L., Frieda, K., & Xie, X. S. (2008). A stochastic single-molecule event
triggers phenotype switching of a bacterial cell. Science, 322(5900), 442-446.
Chubb, J. R., Trcek, T., Shenoy, S. M., & Singer, R. H. (2006). Transcriptional pulsing of
a developmental gene. Current Biology, 16(10), 1018-1025.
Colman-Lerner, A., Gordon, A., Serra, E., Chin, T., Resnekov, O., Endy, D., Brent, R.
(2005). Regulated cell-to-cell variation in a cell-fate decision system. Nature,
437(7059), 699-706.
Crowe, A. J., Piechan, J. L., Sang, L., & Barton, M. C. (2000). S-phase progression
mediates activation of a silenced gene in synthetic nuclei. Molecular and Cellular
Biology, 20(11), 4169-4180.
Elliott, S. G., & McLaughlin, C. S. (1978). Rate of macromolecular synthesis through the
cell cycle of the yeast Saccharomyces cerevisiae. Proceedings of the National Academy
of Sciences, 75(9), 4384-4388.
Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic gene
expression in a single cell. Science, 297(5584), 1183-1186.
Fisher, D., & Mechali, M. (2003). Vertebrate HoxB gene expression requires DNA
replication. The EMBO Journal, 22(14), 3737-3748.
Garí, E., Piedrafita, L., Aldea, M., & Herrero, E. (1997). A set of vectors with a
tetracycline-regulatable promoter system for modulated gene expression in
Saccharomyces cerevisiae. Yeast, 13(9), 837-848.
Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The
Journal of Physical Chemistry, 81(25), 2340-2361.
Gossen, M., & Bujard, H. (1992). Tight control of gene expression in mammalian cells by
tetracycline-responsive promoters. Proceedings of the National Academy of Sciences,
89(12), 5547-5551.
Golding, I., Paulsson, J., Zawilski, S. M., & Cox, E. C. (2005). Real-time kinetics of gene
activity in individual bacteria. Cell, 123(6), 1025-1036.
Guptasarma, P. (1995). Does replication-induced transcription regulate synthesis of the
myriad low copy number proteins of Escherichia coli? Bioessays, 17(11), 987-997.
Hilfinger, A., & Paulsson, J. (2011). Separating intrinsic from extrinsic fluctuations in
dynamic biological systems. Proceedings of the National Academy of Sciences,
108(29), 12167-12172.
Huh, D., & Paulsson, J. (2011). Non-genetic heterogeneity from stochastic partitioning at
cell division. Nature Genetics, 43(2), 95-100.
Kamakaka, R. T., Bulger, M., & Kadonaga, J. T. (1993). Potentiation of RNA polymerase
II transcription by Gal4-VP16 during but not after DNA replication and chromatin
assembly. Genes & Development, 7(9), 1779-1795.
Larson, D. R., Zenklusen, D., Wu, B., Chao, J. A., & Singer, R. H. (2011). Real-time
observation of transcription initiation and elongation on an endogenous yeast gene.
Science, 332(6028), 475-478.
Maiuri, P., Knezevich, A., De Marco, A., Mazza, D., Kula, A., McNally, J. G., &
Marcello, A. (2011). Fast transcription rates of RNA polymerase II in human cells.
EMBO Reports, 12(12), 1280-1285.
Mao, C., Brown, C. R., Falkovskaia, E., Dong, S., Hrabeta-Robinson, E., Wenger, L., &
Boeger, H. (2010). Quantitative analysis of the transcription control mechanism. Mol
Syst Biol, 6
Munsky, B., & Khammash, M. (2006). The finite state projection algorithm for the
solution of the chemical master equation. The Journal of Chemical Physics, 124(4),
Munsky, B., Neuert, G., & van Oudenaarden, A. (2012). Using gene expression noise to
understand gene regulation. Science, 336(6078), 183-187.
Muramoto, T., Cannon, D., Gierliński, M., Corrigan, A., Barton, G. J., & Chubb, J. R.
(2012). Live imaging of nascent RNA dynamics reveals distinct types of
transcriptional pulse regulation. Proceedings of the National Academy of Sciences,
109(19), 7350-7355.
Newman, J. R. S., Ghaemmaghami, S., Ihmels, J., Breslow, D. K., Noble, M., DeRisi, J.
L., & Weissman, J. S. (2006). Single-cell proteomic analysis of S. cerevisiae reveals
the architecture of biological noise. Nature, 441(7095), 840-846.
O'Neill, E. M., Kaffman, A., Jolly, E. R., & O'Shea, E. K. (1996). Regulation of PHO4
nuclear localization by the PHO80-PHO85 cyclin-CDK complex. Science, 271(5246),
209-212.
Paulsson, J., & Ehrenberg, M. (2000). Random signal fluctuations can reduce random
fluctuations in regulated components of chemical regulatory networks. Physical
Review Letters, 84(23), 5447.
Peccoud, J., & Ycart, B. (1995). Markovian modeling of gene-product synthesis.
Theoretical Population Biology, 48(2), 222-234.
Pedraza, J. M., & van Oudenaarden, A. (2005). Noise propagation in gene networks.
Science, 307(5717), 1965-1969.
Raj, A., Peskin, C., Tranchina, D., Vargas, D., & Tyagi, S. (2006). Stochastic mRNA
synthesis in mammalian cells. PLoS Biol, 4(10), e309.
Raj, A., & van Oudenaarden, A. (2009). Single-molecule approaches to stochastic gene
expression. Annual Review of Biophysics, 38(1), 255-270.
Raser, J. M., & O'Shea, E. K. (2004). Control of stochasticity in eukaryotic gene
expression. Science, 304(5678), 1811-1814.
Shahrezaei, V., Ollivier, J. F., & Swain, P. S. (2008). Colored extrinsic fluctuations and
stochastic gene expression. Mol Syst Biol, 4
Suter, D. M., Molina, N., Gatfield, D., Schneider, K., Schibler, U., & Naef, F. (2011).
Mammalian genes are transcribed with widely different bursting kinetics. Science,
332(6028), 472-474.
Tan, R. Z., & van Oudenaarden, A. (2010). Transcript counting in single cells reveals
dynamics of rDNA transcription. Mol Syst Biol, 6
Taniguchi, Y., Choi, P. J., Li, G., Chen, H., Babu, M., Hearn, J., Xie, X. S. (2010).
Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in
single cells. Science, 329(5991), 533-538.
To, T., & Maheshri, N. (2010). Noise can induce bimodality in positive transcriptional
feedback loops without bistability. Science, 327(5969), 1142-1145.
Trcek, T., Larson, D., Moldan, A., Query, C., & Singer, R. H. (2011). Single-molecule
mRNA decay measurements reveal promoter-regulated mRNA stability in yeast.
Cell, 147(7), 1484-1497.
Volfson, D., Marciniak, J., Blake, W. J., Ostroff, N., Tsimring, L. S., & Hasty, J. (2006).
Origins of extrinsic variability in eukaryotic gene expression. Nature, 439(7078), 861-864.
Waldron, C., Jund, R., & Lacroute, F. (1977). Evidence for a high proportion of inactive
ribosomes in slow-growing yeast cells. Biochemical Journal, 168(3), 409-415.
Wolffe, A. (1991). Implications of DNA replication for eukaryotic gene expression. J Cell
Sci, 99 (Pt 2), 201-206.
Wyart, M., Botstein, D., & Wingreen, N. S. (2010). Evaluating gene expression dynamics
using pairwise RNA FISH data. PLoS Comput Biol, 6(11), e1000979.
Zenklusen, D., Larson, D. R., & Singer, R. H. (2008). Single-RNA counting reveals
alternative modes of gene expression in yeast. Nat Struct Mol Biol, 15(12), 1263-1271.
Zhurinsky, J., Leonhard, K., Watt, S., Marguerat, S., Bähler, J., & Nurse, P. (2010). A
coordinated global control over cellular transcription. Current Biology, 20(22), 2010-2015.
Zopf, C. J., Quinn, K., Zeidman, J., & Maheshri, N. (2013). Cell-cycle dependence of
transcription dominates noise in gene expression. PLoS Comput Biol, 9(7), e1003161.
CHAPTER 4. Characterization of the tetO gene regulatory
function using cell-cycle arrested mRNA expression
4.1 Abstract
Isogenic cells in identical conditions display variability in expression. This variability in
gene expression impacts the dynamics and function of gene regulatory networks. Predicting
the nature of this impact depends on the extent and statistics of the noise, which in turn
depends on the biological and molecular origins of the noise. Hence, a prerequisite for
understanding or designing the function of gene regulatory networks is to characterize the
origins and statistics of the noise, in response to regulatory conditions. The paradigm that
noise arises from stochastic "bursts" of transcription is challenged by evidence of
previously-unappreciated extrinsic sources of noise.
We showed in Chapter 3 that large differences in transcriptional activity across the
cell-cycle dominate noise in transcription from the tetO promoter in budding yeast. Here, we
further characterize the origins of noise in cell-cycle driven transcription across activation
levels and for two promoters differing by operator number, to determine the origins of
expression noise. By measuring single cell mRNA distributions in cells arrested at the G1/S
and G2/M transitions, we find: (1) gene activation in S/G2 is correlated between each locus
with a probability dependent on activator levels, (2) there is no G1 transcription below a
certain activator threshold, (3) mRNA distributions from an active tetO promoter with a
single operator are Poisson but from an active multiple operator tetO promoter they are
super-Poissonian. Similar conclusions can be made for the native PHO5 promoter, indicating
these results are not an artifact of using a synthetic gene. These results confirm that much
of the variability in mRNA levels is due to the large differences in the probability of
activation between cells in G1 and S/G2. However, they also show the interplay of aspects
of promoter architecture that have been associated with affecting noise, such as increased
operator number, with cell-cycle dynamics to provide an integrated picture.
4.2 Introduction
Noise in gene expression is the sum of all sources of variability leading up to transcription
and expression of a gene. Some variability results from the stochastic, single-molecule
biochemical reactions that comprise steady-state transcription at a gene locus. Experimental
evidence suggests these dynamics of transcription are dominated by brief bursts of activity
followed by long inactive periods, with "burstiness" modulated by promoter architecture and
correlated with expression range (Cai, Friedman & Xie, 2006; Chubb et al., 2006; Golding
et al., 2005; Raj & van Oudenaarden, 2009; Taniguchi et al., 2010). Noise in expression is
often attributed to such dynamics, and used to infer how regulatory elements control
transcription (see Table 1-1 for examples). But noise also arises from fluctuations in global
cell activities that modulate transcription or expression, and may result in expression with
statistics that agree with transcriptional bursting (Figure 3-3). Even noise that appears
intrinsic may depend on extrinsic factors in the cell's history (Hilfinger & Paulsson, 2011).
Dynamics inferred from expression noise without accounting for extrinsic variability
over-predict noise at the promoter and incorrectly infer regulatory modes.
In previous work, we demonstrated that expression noise from synthetic tetO promoters
in budding yeast is largely due to differences in transcriptional activity between the G1 and
S/G2 stages of the cell-cycle (Zopf, Quinn et al., 2013, Chapter 3). But our previous methods
limited further characterization. Fluctuations in protein levels have low time- and
molecular-resolution. Expression across a snapshot of a growing population has no temporal
information and cell-cycle fluctuations in transcription are blurred by slow mRNA turnover.
Data obtained with these methods could not discriminate between multiple qualitative
descriptions of how transcription changes across the cell-cycle, did not titrate activator levels
and was limited to the synthetic tetO promoter.
Here, we seek to quantitatively characterize transcription dynamics across the cell-cycle
in response to activator levels and the number of promoter binding sites. Analysis of these
measurements enables us to arrive at a model of the transcription dynamics and expression
at a single gene locus across the cell cycle. We extend our investigation to the native PHO5
promoter and discuss the consequences of naively interpreting all transcriptional noise as
intrinsic in origin.
4.3 Results:
4.3.1 Analysis of single-molecule nuclear and cytoplasmic mRNA FISH in
arrested cells reveals instantaneous cell-cycle dependent transcription
While we previously found much of the transcriptional variability of tetO promoters was
due to cell-cycle phase dependent differences, we were unable to determine to what extent
variability driven by noisy expression within each cell-cycle phase contributed to the overall
noise. Moreover, it was difficult to distinguish whether a promoter was active.
In previous work in Chapter 3, we utilized mRNA FISH measurements of Venus YFP
transcripts driven from genomically integrated tetO promoters to infer transcriptional
activity. Because transcripts of Venus YFP have a half-life (t1/2 = 20 minutes; To &
Maheshri (2010), Figure 6-2) on the order of G1 (~60 minutes) or S/G2/M (~60 minutes) in
fast-growing conditions, steady-state YFP transcript levels represent a time-averaged
activity of expression across different cell-cycle stages and are hence a poor reporter of
cell-cycle stage-specific transcription. This prevented us from determining to what extent
variability within a cell-cycle stage contributed to overall noise. Moreover, it was difficult
to precisely determine whether a promoter was active in G1, as the presence of mRNA could
potentially be due to activity in the previous S/G2/M stage.
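The time-averaging argument can be made concrete with a deterministic sketch of mean mRNA level m obeying dm/dt = k(t) - γm, where the transcription rate k(t) alternates between a G1 value and an S/G2/M value. The 20 minute half-life and the two 60 minute phases come from the text; zero G1 transcription, a unit S/G2/M rate, the neglect of mRNA partitioning at division, and the function names are simplifying assumptions made only for illustration.

```python
import math

def phase_end_level(m_start, rate, decay, duration):
    """Exact solution of dm/dt = rate - decay * m after `duration` minutes."""
    steady = rate / decay
    return steady + (m_start - steady) * math.exp(-decay * duration)

def periodic_levels(rate_g1, rate_s, half_life, t_g1, t_s, n_cycles=50):
    """Mean mRNA at the end of G1 and at the end of S/G2/M once cycles repeat."""
    decay = math.log(2.0) / half_life
    m_end_s = 0.0
    for _ in range(n_cycles):
        m_end_g1 = phase_end_level(m_end_s, rate_g1, decay, t_g1)
        m_end_s = phase_end_level(m_end_g1, rate_s, decay, t_s)
    return m_end_g1, m_end_s

# No G1 transcription at all, yet G1 cells still carry mRNA made in S/G2/M.
end_g1, end_s = periodic_levels(rate_g1=0.0, rate_s=1.0,
                                half_life=20.0, t_g1=60.0, t_s=60.0)
print("mRNA at end of G1 relative to end of S/G2/M:", round(end_g1 / end_s, 3))
```

Under these assumptions a promoter that is completely silent in G1 still leaves residual transcripts throughout G1, with one eighth of the late S/G2/M level remaining even at the very end of the phase, which is why transcript counts in growing G1 cells cannot by themselves establish G1 activity.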
We reasoned that steady-state mRNA FISH measurements of tetO promoter-driven
Venus mRNA in cells arrested at the G1/S checkpoint (by addition of alpha-factor) or G2/M
checkpoint (by addition of nocodazole) for several mRNA lifetimes would reflect the
transcription dynamics when rapidly growing cells were in G1 or S/G2/M, respectively.
To validate this approach, we tested whether arrest affects normal transcription
dynamics. To do so, we measured stationary mRNA expression distributions in both growing
populations (stratified by cell-cycle stage according to our previous method, Chapter 3.3)
and G1/S or G2/M arrested populations. Figure 4-1A describes timing of mRNA FISH
measurements post-arrest. Because an asynchronously growing population is arrested, cells
will exist in all cell-cycle stages. The vertical timeline in Figure 4-1A describes events that
would occur for a cell that was at the beginning of S phase and had a cell cycle of length 2
hours (1 hr for G1 and 1 hr for S/G2/M, as measured for these cells). So for example, if an
S phase cell was arrested with nocodazole and fixed and analyzed after 5 hours of nocodazole
treatment, it would actually have been arrested just over 4 hours (since it would take just
under 1 hour for the cell to reach the G2/M checkpoint). Similarly, if an S phase cell was
arrested with alpha-factor and fixed and analyzed after 5 hours of alpha-factor treatment,
it would actually have been arrested for 3 hours (since it would complete a full 2 hour cell
cycle before it reaches the G1/S checkpoint). The curly braces in Figure 4-1A describe the
range of time various members of the cell population have actually been arrested subsequent
to 2, 5, or 8 hours of nocodazole treatment, or 3 or 5 hours of alpha-factor treatment.
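The arithmetic relating treatment time to actual arrest time can be written out as a short sketch. It assumes the 2 hour cycle described above (1 hour of G1 followed by 1 hour of S/G2/M), places the G1/S checkpoint at the end of G1 and the G2/M checkpoint at the end of S/G2/M, and rounds "just under 1 hour" to 1 hour; the constants and function names are illustrative and not part of the analysis software.

```python
CYCLE_H = 2.0              # cell-cycle length in hours (from the text)
G1_S_CHECKPOINT_H = 1.0    # alpha-factor arrest point, hours after the start of G1
G2_M_CHECKPOINT_H = 2.0    # nocodazole arrest point, taken as the end of S/G2/M

def hours_to_checkpoint(position_h, checkpoint_h, cycle_h=CYCLE_H):
    """Time for a cell at `position_h` (hours since the start of G1) to reach a checkpoint.

    A cell sitting exactly at the checkpoint position is treated as having just
    passed it, so it needs one full cycle to come back around.
    """
    wait = (checkpoint_h - position_h) % cycle_h
    return wait if wait > 0 else cycle_h

def actual_arrest_h(treatment_h, position_h, checkpoint_h, cycle_h=CYCLE_H):
    """Hours a cell has actually spent arrested when it is fixed."""
    return max(0.0, treatment_h - hours_to_checkpoint(position_h, checkpoint_h, cycle_h))

def arrest_range_h(treatment_h, cycle_h=CYCLE_H):
    """Range of actual arrest times across an asynchronous population."""
    return (max(0.0, treatment_h - cycle_h), treatment_h)

# A cell at the beginning of S phase (1 hour after the start of G1), fixed after 5 hours.
print("nocodazole:  ", actual_arrest_h(5.0, 1.0, G2_M_CHECKPOINT_H), "h arrested")
print("alpha-factor:", actual_arrest_h(5.0, 1.0, G1_S_CHECKPOINT_H), "h arrested")
for treatment in (2.0, 5.0, 8.0):
    print("nocodazole", treatment, "h treatment:", arrest_range_h(treatment))
```

The population ranges produced this way correspond to the curly braces described for Figure 4-1A.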
After these cells are fixed and stained, mRNA numbers per cell are automatically
determined using custom image analysis software, and the corresponding mRNA
distributions can be generated. Experimental error in mRNA FISH measurements like these
arises from technical error in the actual measurement and biological error between biological
replicate samples. mRNA FISH has been validated to not suffer from systematic technical
error (Raj et al., 2008; To & Maheshri, 2010) and the measurement error is dominated
by biological error. Based on multiple biological replicate measurements on different days,
the biological error is never > 10% and usually ranges from 5-10%. When the mean and
variance are estimated from these distributions the sampling error is < 5%, as distributions
are generally constructed from N > 100 cells (N = 186 +/- 95 cells per sample, average
+/- st. dev.; see Appendix 6.4 for error analysis). For cells in S/G2/M, two copies of the
tetO promoter are present. Therefore, to control for this difference and facilitate comparison
with cells in G1, we present the distribution of mRNA per gene copy. We do so by subjecting
the distribution to binomial partition (random assignment of each mRNA to each gene
copy). Hence data described here control for the effects of gene dosage, and any differences
between G1- and G2/M-arrested distributions are due to additional cell-cycle dependent
effects.
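The binomial partition step can be sketched as below. This is an illustrative reading of "random assignment of each mRNA to each gene copy", assuming each transcript is attributed to one of the two copies with equal probability and that the count for a single copy is kept; the function name `per_gene_copy` and the example counts are hypothetical.

```python
import random

def per_gene_copy(counts, rng, p_copy=0.5):
    """Binomially partition each cell's mRNA count and keep one gene copy's share.

    Each mRNA is assigned independently to the tracked copy with probability
    p_copy (0.5 for two equivalent copies, an assumption of this sketch).
    """
    return [sum(1 for _ in range(n) if rng.random() < p_copy) for n in counts]

rng = random.Random(0)
measured = [0, 3, 12, 25, 40]          # hypothetical mRNA per S/G2/M cell
partitioned = per_gene_copy(measured, rng)
print(list(zip(measured, partitioned)))
```

Partitioning halves the mean while adding binomial sampling variance, so the per-copy distribution is not simply the measured distribution divided by two.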
We first employed the above protocol on cells containing a genomically-integrated copy
of the promoter of the low expressed DOA1 housekeeping gene (PDOA1). This gene has
previously been reported to generate a near-Poisson steady-state mRNA distribution and
hence would not be expected to show large differences in expression in G1- and
G2/M-arrested cells. We find that the mRNA distribution does not change between 3 and 5 hours
after alpha-factor arrest, and also looks similar at 2, 5, and even 8 hours after nocodazole
arrest (Figure 4-2A). This verifies that arrest is not affecting expression. The G1-arrested
distributions (blue bars in Figure 4-2A) have slightly higher means than S/G2, indicating
there is no large decrease in expression in G1-arrested cells relative to G2/M-arrested cells.
The high expressing P7xtetO (activator levels at 95% of the maximum) has also been
previously shown to exhibit only 2-fold gene-dosage differences in transcriptional activity in
G1 versus S/G2/M stages. Similar to PDOA1, steady-state expression is achieved by 3
(alpha-factor) or 5 (nocodazole) hour arrest, and the distributions look equivalent. Interestingly, at
8 hours post-nocodazole arrest the mean mRNA levels decrease significantly, suggesting that
very prolonged arrest may alter transcript levels. Still, we can conclude that arrest has no
global effects on gene expression at least 3-5 hours post-arrest (which constitute many
mRNA lifetimes).
[Figure 4-1, panel A timeline annotations: 2 h nocodazole treatment = 0-2 h arrest; 5 h nocodazole
treatment = 3-5 h arrest; 8 h nocodazole treatment = 6-8 h arrest; 3 h alpha-factor treatment =
1-3 h arrest; 5 h alpha-factor treatment = 3-5 h arrest. Panel C x-axis: cytoplasmic mRNA count per cell.]
Figure 4-1: mRNA FISH of cell-cycle arrested populations. (A) Asynchronously growing
populations of yeast cells were arrested with (left) nocodazole or (right) alpha-factor, and cells were
fixed 2, 5, and 8 hours post-treatment for nocodazole or 3 and 5 hours post-treatment for alpha-factor.
Because the population is growing asynchronously, cells experience varied times of actual arrest,
depicted by the range (see text for details). (B) Representative mRNA FISH images of (top row)
highly-expressed P7xtetO growing, (middle row) G1-arrested and (bottom row) G2/M-arrested cells.
Bright field and fluorescent images of fixed cells hybridized with mRNA probes are shown in the left
and middle columns, with mRNAs detected using custom image analysis routines overlaid in the right
column. (C) Subsequent to image analysis, the corresponding mRNA distributions can be constructed,
shown here for all S/G2/M growing (orange), G1-arrested (blue) and G2/M-arrested (red) cells with
highly-expressed P7xtetO. Above each distribution is a box whisker plot where the black dot describes
the mean of the distribution, the edges of the box denote the 25th to 75th percentile of the histogram,
and the ends denote the 10th and 90th percentile. Because samples in S/G2/M have two copies of the
tetO promoter, we construct an mRNA distribution on a per gene copy basis to facilitate direct
comparison with distributions from cells in G1. This distribution is constructed by binomially
partitioning the measured S/G2/M distribution (see text).
When activator levels are lower, we previously inferred large differences between expression
in S/G2/M and G1 given there was a 4-to-6-fold difference in respective mRNA levels when
measured in growing populations. After 3 hours of arrest in G1, we would expect any
contribution of mRNA transcripts arising during the previous S/G2/M stage to
disappear due to degradation. G1-arrested mRNA levels from the P7xtetO promoter at
13%-activation fall from 9 mRNA on average per cell in growing G1 cells to just 1 mRNA per
cell with a median of 0 mRNA after 3 hours arrest (Figure 4-2C). Therefore 3 hours of G1
arrest, which lasts 6-to-12 mRNA half-lives, reflects steady-state G1 expression.
After 2 hours of G2/M arrest (where the asynchronous population experiences between
0 to 2 hours of actual arrest), the expression is slightly lower than growing cells identified
as near G2/M based on bud size. Interestingly, whereas expression at 95% activation was
unchanged until 5 hours of arrest, G2/M expression at 13% activation falls between 2 hours
and 5 hours of arrest, suggesting activator-dependent stability of the active state. (This
trend is identical at 5% activation, data not shown.) This activator-dependent decrease in
mRNA also confirms that arrest has not simply ceased all production and degradation of
mRNA. Similarly, 2 hours of G2/M arrest is a pseudo-steady-state approximation of late
G2/M activity. Extended durations of G2/M arrest report on the stability of the G2/M
active state in the absence of the mitosis/G1 transition.
[Figure 4-2 image: panels A-C show box and whisker plots of cytoplasmic mRNA per gene copy; panels D and E show the fraction of active cells; samples are under G1 arrest and G2 arrest.]
Figure 4-2: Cytoplasmic mRNA distributions and active cell fractions at various times after cell-cycle
arrest. (A, B, C) mRNA expression distributions are presented using box and whisker plots as in
Figure 4-1C for the (A) constitutive DOA1 promoter, (B) a highly activated 7xtetO promoter, and
(C) a lower expressing 7xtetO promoter. In both (A) and (B), arrested distributions plateau at 3 (for
alpha-factor) and 5 (for nocodazole) hours post-arrest, indicating arrest does not affect transcriptional
activity in this time range. G1- (blue) and G2/M- (red) arrested cell distributions are nearly
equivalent, hence no significant cell-cycle stage differences in transcriptional activity are present. In
contrast, at lower expression, mRNA levels in G1-arrested cells steadily decrease from that in late G2/M,
till there are no mRNA present 3 hours post-arrest. This shows no expression in G1 4 hours after
S-phase. Expression under G2/M arrest decreases too, though slower. (D) The fraction of active cells,
indicated by a nuclear spot, remains high for 95%-activated P7xtetO. Note that cells in G2/M have two
gene copies: if either is active then the cell appears active. (E) The fraction of active cells in G1 for
13%-activated P7xtetO drops to zero after M-phase and 3 hours of arrest, showing clear lack of G1
activity. Experimental error is not shown, but error in mean expression is 5-10% between sample
replicates and an additional <5% due to population sampling.
But, compared with the mRNA at the site of transcription in the nucleus, this
cytoplasmic mRNA is a relatively delayed readout of transcriptional activity. It may also
contain additional noise from mRNA processing and export. Nuclear mRNA is visible as a
bright spot in mRNA FISH images and aligns with a DAPI stain (Figure 4-1B). Partially
complete nascent mRNA transcripts may contribute significantly, but we interpret the spot
intensity as the equivalent number of full transcripts. PDOA1 expression produces no bright
nuclear mRNA spots because of its low, Poisson-distributed expression. Measurement of the
rate of nuclear spot degradation with thiolutin treatment finds that the lifetime of the nuclear
spot is similar to that of cytoplasmic mRNA (see Appendix Figure 6-2). Therefore, presence of a
nuclear spot indicates active or recent transcription, and its intensity indicates the
magnitude of activity; absence of nuclear mRNA indicates that a cell has been OFF for
tens of minutes, given its 20 minute half-life. Across all samples, average cytoplasmic mRNA
counts are double the average nuclear mRNA counts (Figure 6-3), suggesting nuclear export
occurs at twice the rate of cytoplasmic mRNA degradation (Appendix 6.3.2).
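The link between the 2:1 count ratio and the 2:1 rate ratio can be illustrated with a minimal two-compartment sketch. It assumes linear kinetics, dN/dt = mu - k_export*N for nuclear mRNA and dM/dt = k_export*N - k_degradation*M for cytoplasmic mRNA; the appendix derivation is not reproduced here, and the rate values below are hypothetical.

```python
def steady_state_counts(mu, k_export, k_degradation):
    """Steady state of the assumed two-compartment model.

    dN/dt = mu - k_export * N                  (nuclear mRNA)
    dM/dt = k_export * N - k_degradation * M   (cytoplasmic mRNA)
    """
    nuclear = mu / k_export
    cytoplasmic = k_export * nuclear / k_degradation
    return nuclear, cytoplasmic

# Illustrative rates only: export twice as fast as cytoplasmic degradation.
k_export = 1.0 / 10.0        # per minute (hypothetical)
k_degradation = 1.0 / 20.0   # per minute (hypothetical)
mu = 0.7                     # mRNA per minute (hypothetical)

nuclear, cytoplasmic = steady_state_counts(mu, k_export, k_degradation)
ratio = cytoplasmic / nuclear   # equals k_export / k_degradation
```

In this sketch the steady-state ratio of cytoplasmic to nuclear counts depends only on k_export / k_degradation, not on the transcription rate, which is why a count ratio of 2 across all samples points to a rate ratio of 2.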
The fraction of active cells during arrest corroborates that arrest maintains the
transcription dynamics of growing cells. It is high for 95%-activated P7xtetO expression and
maintained during arrest (Figure 4-2D). For 13%-activated P7xtetO, few cells are active in G1
arrest (3%). We employ more detailed modeling of transcriptional dynamics later in this
chapter. But we can quickly assess whether expression in growing G1 cells agrees with
transcription at arrested G1 levels buffered by history from higher S/G2/M transcription
levels. It is consistent, with 37% of cells active, versus 30-40% expected from the sum of
degradation of S/G2/M expression and transcription at low G1-arrest levels (Figure 4-2E).
(This is also true for median cytoplasmic mRNA expression, observed at 5 mRNA for growing
G1 cells, compared with an expectation of 5-8 mRNA.)
Therefore cytoplasmic and nuclear mRNA expression under arrest in G1 (3 hours) and
G2/M (2 hours) is a readout of G1 and S/G2/M transcription dynamics. The data also
suggest that extended periods in G2/M have a similar but dampened effect to the M/G1
transition in turning OFF transcriptional activity at low and intermediate expression.
4.3.2 mRNA expression under cell-cycle arrest reveals that activator regulates probabilistic activation of a long-lived transcribing state in S/G2
To explore a large domain of regulatory conditions, we titrated expression over a large
range of activator levels at the tetO promoter with 1 and with 7 activator binding sites,
then arrested growth and measured mRNA expression (Figure 4-3). The trend in how mRNA
levels change with activator level is striking: the ratio of G2/M to G1 expression per locus
shifts from infinite (no G1 expression) to equivalent as activator level increases (Figure
4-3A,B). For both P1xtetO and P7xtetO, basal expression occurs in S/G2/M, but not in G1
(Figure 4-3, blue). This confirms our previously inferred results (Zopf et al., 2013). Nuclear
mRNA measurements in G1-arrested cells reveal a clear threshold activator level below
which there is no expression in G1 (40% for P1xtetO, 20% for P7xtetO; Figure 4-3C,D).
The trend in the fraction of cells with nuclear mRNA spots ("ON" cells) reveals new
information about underlying dynamics. The fraction of ON cells increases just as much as
the overall increase in expression levels (Figure 4-3C,D), meaning that cells' likelihood of
activity, and not the productivity of an active cell, is the activator's major point of control.
Because we infer dynamics from static measurements, we cannot directly measure the
dynamics of switching between these ON and OFF states. But the rate can be confidently
estimated using the 20-30 minute nuclear mRNA half-life and the extent of separation
between ON and OFF subpopulation expression (Figure 4-3E-H). If the timescale of
switching is faster than 20 minutes, most cells will appear ON. If switching is slower,
cytoplasmic expression will be similar between cells with or without nuclear mRNA. Because
we observe a large separation between cytoplasmic mRNA expression levels of ON and OFF
cells (Figure 4-3E-H), the timescale of switching must be longer than the mRNA lifetime.
Given we know expression changes at the M/G1 transition, we conclude that expression is
sustained throughout S/G2. From this, and the previous section, we can conclude that the tTA
activator regulates the likelihood of activation in S/G2.
[Figure 4-3 image: for 1xtetO and 7xtetO under G1 arrest and G2 arrest, panels A and B plot cytoplasmic mRNA per gene copy; panels C and D plot the fraction of active cells; panels E and F plot cytoplasmic mRNA per gene copy in ON cells; panels G and H plot cytoplasmic mRNA per gene copy in OFF cells.]
Figure 4-3: Activator titration of mRNA expression under cell-cycle arrest. (A, B) For P1xtetO and
P7xtetO, there is no mRNA expressed in G1-arrested cells at low activator levels. As activator increases,
expression in G1 and S/G2/M is equivalent, when examined on a per gene copy basis. (C, D) Increase
in the fraction of active cells explains most of the dynamic range. (E-H) Expression substantially
differs between the ON (E, F) and OFF (G, H) subpopulations, indicating a long-lived active and
inactive state. Activator-dependent increase in ON-state expression is small, and later proves to be
due merely to the increasing fraction of S/G2/M ON cells with 2 rather than 1 active loci.
Experimental error is 10-15% (Appendix 6.4).
4.3.3 Conditional variances quantify the origins of gene expression noise
Given expression level varies substantially between cells in G1 or S/G2/M and with or
without nuclear mRNA ("ON" or "OFF"), a breakdown of total expression variance into the
contribution from each subpopulation is a useful visualization of the origins of noise. First,
we review the trend in noise across growing populations (Figure 4-4, B&E, black). The Fano
factor of growing population expression is higher for P7xtetO than P1xtetO, as previously reported
for these promoters (To & Maheshri, 2010). In both cases, the Fano factor increases at low
levels of activator then plateaus or decreases at higher levels of activator. (In the bursting
model, the Fano factor is equivalent to the "burst size", and we revisit this interpretation in
Section 4.4.1.) Breaking down growing populations by cells' activity state (ON/OFF) and
cell-cycle stage begins to explain this trend. Figure 4-4A&D shows growing populations'
cytoplasmic mRNA expression distribution broken down into cell-cycle subpopulations. G1
expression distributions differ from S/G2/M.
We can decompose the variation in mRNA levels (X) due to the conditions (C) of cell-cycle stage and whether the promoter is ON/OFF using the Law of Total Variance:
4-1:
$$\mathrm{Var}(X) = \mathrm{E}\left(\mathrm{Var}[X \mid C]\right) + \mathrm{Var}\left(\mathrm{E}[X \mid C]\right)$$
The first term is the weighted sum of the variance within each conditioned
subpopulation, such as cells in G1. This may still contain variability from extrinsic sources
that affect expression in a given state, along with the variance from stochastic single-molecule
transcription events. The second term is the variance of the mean of each
subpopulation, and hence describes the variability due to differences between expression in
the different conditions. Figure 4-4B&E breaks down the variance (normalized by population
mean) into variance derived within (labeled by the condition) and between the cell-cycle
stages (labeled by "Between"), and Figure 4-4C&F normalizes by total variance to show relative
contributions. We see that differences in expression across the cell-cycle alone contribute 20-30% of
total expression variance. But the slow degradation of mRNA means that expression in each
cell-cycle stage contains history of transcription in previous cell-cycle stages. Thus the
variance "Between" the subpopulations under-predicts the variance derived from
transcriptional differences between these cell-cycle and activity (ON/OFF) states.
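The within/between split given by the Law of Total Variance can be sketched numerically as below. This is only an illustration: the function name and the per-cell counts are made up and are not data from this work, and population (rather than sample) variances are used so that the two terms sum exactly to the total.

```python
def variance_decomposition(groups):
    """Split total variance into within- and between-group parts.

    groups maps a condition label to a list of per-cell mRNA counts.
    Returns (total, within, between) using population variances.
    """
    all_counts = [x for counts in groups.values() for x in counts]
    n_total = len(all_counts)
    grand_mean = sum(all_counts) / n_total

    within = 0.0
    between = 0.0
    for counts in groups.values():
        weight = len(counts) / n_total
        group_mean = sum(counts) / len(counts)
        group_var = sum((x - group_mean) ** 2 for x in counts) / len(counts)
        within += weight * group_var                        # E(Var[X|C])
        between += weight * (group_mean - grand_mean) ** 2  # Var(E[X|C])

    total = sum((x - grand_mean) ** 2 for x in all_counts) / n_total
    return total, within, between

# Made-up counts, for illustration only.
example = {
    "G1 OFF": [0, 1, 0, 2, 1],
    "G1 ON": [8, 10, 12],
    "S/G2/M ON": [20, 25, 18, 22],
}
total, within, between = variance_decomposition(example)
between_fraction = between / total
```

Dividing each term by the population mean gives contributions in units of mRNA, and dividing by the total variance gives relative contributions.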
[Figure 4-4 image: panels A and D plot distributions of cytoplasmic mRNA count per cell for 1xtetO and 7xtetO at 1%, 13%, 50% and 95% activation; panels B, C, E and F plot variance contributions and the population Fano factor against activation.]
Figure 4-4: Variability in growing populations apportioned to variance within and between cell-cycle
stages. (B, E, black triangles) The Fano factor of growing populations increases and then
plateaus, at a higher noise level for P7xtetO than P1xtetO. The origin of this trend is revealed later in
Figure 4-12. (A) Growing population P1xtetO cytoplasmic mRNA expression distributions for four
activator levels (1%, 13%, 50%, 95% activation). G1 and S/G2/M expression differ substantially.
(B, C) Breakdown of total variance in cytoplasmic mRNA expression, normalized by population mean
to give units of mRNA, into variance from within and between cell-cycle stages. (D, E, F) P7xtetO
expression, as for (A, B, C).
However, in the previous section we found that the fraction of active cells is a large
difference between cells in G1 or S/G2/M. Figure 4-5 shows expression and variance
conditioned on the subpopulations of active (ON) or inactive (OFF) cells in G1 and
S/G2/M. As before, Figure 4-5B&E and Figure 4-5C&F show contribution to normalized
and total variance for P1xtetO and P7xtetO respectively. The variance between these
subpopulations is substantial, at 50-70%. Of course, the S/G2/M ON subpopulation includes
cells with 1 or 2 loci ON, and noise is approximately apportioned between the two here. Not
all variability within the subpopulations is due to stochastic transcription. Some expression
variability in G1-OFF cells at lower activator levels is from levels of mRNA decreasing
throughout G1 (which has a lower transcription rate than the preceding S/G2/M). It also
reflects fluctuations in mRNA processing and export and variable global cell activity,
perhaps related to cell size. This leaves stochastic, single-molecule promoter fluctuations
contributing no more than 25% of the total noise in expression from these tetO promoters.
[Figure 4-5 image: panels A and D plot distributions of cytoplasmic mRNA count per cell for 1xtetO and 7xtetO at 1%, 13%, 50% and 95% activation; panels B, C, E and F plot variance contributions against activation. Legend: variance between subpopulations; S/G2/M, ON at 1 locus; S/G2/M, ON at 2 loci.]
Figure 4-5: Variability in growing populations apportioned to variance within and between cell-cycle
and ON/OFF subpopulations, showing variance between dominates. (A) Growing population
P1xtetO expression distributions for four activator levels, showing the changing contributions of G1 and
S/G2/M OFF and ON subpopulations. (B, C) Subpopulations' contribution to total variance in
cytoplasmic mRNA expression, showing that variance between the subpopulations is substantial.
(D, E, F) P7xtetO expression, as for (A, B, C).
Breaking down the expression variability in near steady-state arrested cell populations
better represents the underlying transcription in the ON/OFF active states and informs our
choice of model. For G2/M-arrested cells, expression distributions show that the fraction of
cells in the OFF or ON state changes dramatically with activator level, but how the mean
and shape of the OFF or ON distribution changes is unclear (Figure 4-6A&D). Accounting for
subpopulations with 1 or 2 active loci reveals that noise between the ON and OFF
populations dominates, contributing 25-50% of variance from P1xtetO and P7xtetO expression
(Figure 4-6C&F).
[Figure 4-6 image: panels A and D plot distributions of cytoplasmic mRNA count per cell for 1xtetO and 7xtetO at 1%, 13%, 50% and 95% activation; panels B, C, E and F plot variance contributions against activation. Legend: variance between subpopulations; G2/M arrest, OFF; G2/M arrest, ON at 1 locus; G2/M arrest, ON at 2 loci.]
Figure 4-6: Variability in G2/M-arrested populations apportioned to variance within and between
ON/OFF subpopulations. (A) P1xtetO expression distributions show OFF and ON subpopulations are
tighter than the full distribution. (B, C) Even within a cell-cycle phase, much of the variance derives from
variation between, rather than within, ON and OFF subpopulations. (D, E, F) P7xtetO expression, as
for (A, B, C). Variance within the subpopulations comprises a higher fraction of total variance
compared with P1xtetO, but variation between the ON and OFF populations is still high.
Cells arrested in G1 have only one gene locus and so are the closest direct reflection of
dynamics at a single locus (Figure 4-7). At most activation levels, variance in the OFF
subpopulation dominates, partly because it is the largest subpopulation. Comparison of the
normalized variance for P1xtetO and P7xtetO shows that P7xtetO has higher intrinsic variability,
by the same 2-fold ratio over P1xtetO as in the original growing population Fano factor. But the
Fano factor of these subpopulations is much smaller, around 2 and 4, which begins to reveal
the true dynamics at the promoter and informs our choice of model for the next section.
[Figure 4-7 image: panels A and D plot distributions of cytoplasmic mRNA count per cell for 1xtetO and 7xtetO at 1%, 13%, 50% and 95% activation; panels B, C, E and F plot variance contributions against activation. Legend: variance between subpopulations; G1 arrest, ON at 1 locus.]
Figure 4-7: As for Figure 4-6, but for expression under G1 arrest, which is from a single locus.
(B, E) Variance from active loci is relatively small compared with variance from the original total
growing population. (C, F) Variance between the OFF and ON states is up to 50% of total variance.
Taken together, this analysis shows that variability between subpopulations dominates
the total noise in expression and the variability within a single state is surprisingly low. But
a model is useful to characterize the underlying transcription, beyond the simple statistics
of the resulting expression distribution. In addition, we have not been able to account for
the complication of expression from 1 or 2 active loci in S/G2/M conditions nor to
understand the kinetics of transitioning between cell-cycle stage expression levels in growing
populations. This motivates a model of transcription dynamics in each cell-cycle stage.
4.3.4 A model of transcription underlying arrested expression distributions reveals that activator only regulates the probability and stability of activity, whereas the promoter determines active transcription dynamics
A model of transcription at the tetO promoter in G1 and S/G2/M will provide insight
into the activator's mechanism of regulating transcription and the dynamics of transcription,
and enable us to link these insights to the expression observed in growing populations.
Analysis of nuclear mRNA has shown the general dynamics of transcription from the tetO
promoter across the cell-cycle are activation events in S/G2/M and in G1 to an ON state
that lasts throughout a cell-cycle stage. Thus the metrics for characterizing transcription
are the likelihood of activation and the expression from an active locus. Although cytoplasmic
mRNA is the functional product of transcription, we have described how nuclear mRNA is
a more direct readout of transcription. Thus the model is based on the presence and intensity
of nuclear mRNA spots.
The nuclear mRNA, rather than cytoplasmic mRNA, expression distribution in G1-arrested
cells is the most direct readout of transcription dynamics from an active locus. Cells
in G2/M arrest contain two loci and we cannot distinguish whether one or two are ON. This
presents a challenge to separating activation likelihood and active expression in S/G2/M,
which we overcome later in the model development. Thus we compare the nuclear expression
distributions in G1-arrested cells with a single active locus to see how they depend on activator
level and binding site number. These distributions are overlaid in Figure 4-8A. Expression
is unaffected by activator level, but differs for P1xtetO and P7xtetO. Thus active expression is a
function of promoter only. To gain intuition about the transcription dynamics underlying
the expression and in order to develop a predictive model, we parameterize the distribution.
(We incorporate data from all activator levels, weighted by the fraction of active cells at
each activator level to account for sampling error in distributions constructed from fewer
cells.) Thus we seek a model of transcription that produces an expression distribution that
fits the P1xtetO and P7xtetO G1-arrest nuclear expression distributions.
The P1xtetO expression distribution has a Fano factor close to 1 mRNA, suggesting low
variability in the transcription rate. (We've previously shown that transcription producing
mRNA with a constant rate and first-order mRNA degradation results in a Poisson
distribution, which has a Fano factor of 1 mRNA.) The distribution is well-fit by a Poisson
distribution with rate of 7 mRNA produced per nuclear mRNA lifetime (overlaid on the
data in Figure 4-8A). Constant transcription from the active P1xtetO promoter is surprising:
this is recognized as a "noisy" gene, yet once active it has minimal variability in the timing
of transcription events.
The P7xtetO expression distribution has higher variance. Thus active P7xtetO promoters
must be transcribed with a range of transcription rates. We cannot detect whether the
transcription rate at a single promoter fluctuates quickly (much faster than a cell-cycle
stage) or whether a single promoter maintains a particular transcription rate for an entire
cell-cycle stage with variation in the rates between promoters. But the lifetime of nuclear
mRNA at the active locus (~10 minutes) sets a "joint distribution" on the range and rate of
switching transcription rates. For the former option, a single promoter must sample greatly
differing (>> 10-fold) transcription rates during a cell-cycle stage, which are "averaged" by
the nuclear mRNA at the locus. In this case, the expression variance still comes from some
cells sampling a higher transcription rate than others "on average" throughout the cell-cycle
stage. But for the latter option, the variance of the expression distribution will give the
range of transcription rates. Because we cannot easily distinguish between the two, and
functionally they are similar, we create a model reflecting the latter situation. But we could
use that model to extrapolate and consider the rate and range of transcription rates sampled
by a single promoter.
Thus we model expression with a range of transcription rates, ignoring fluctuations
between rates. This could be done by simply fitting the expression distribution to an
empirical curve that represents the probability of a given rate of transcription (this is
equivalent to breaking down the super-Poisson distribution into infinite Poisson
distributions). But we can gain more intuition if we fit the distribution to a finite number
of Poisson distributions, because that gives a simple indication of the range of transcription
rates. We choose transcription rates that are integer multiples of the P1xtetO rate, inspired by
the fact that the left-side of the P7xtetO expression distribution is well-fit by a Poisson
distribution with 7 mRNA produced per nuclear mRNA lifetime, like P1xtetO.
This suggests that the minimum transcription rate from an active P7xtetO locus occurs when it
behaves like P1xtetO, perhaps having only one of seven binding sites occupied. To choose how
many Poisson distributions to use to parameterize the expression distribution, we fit the
distribution and then compare the squared-error. One Poisson distribution cannot capture
the super-Poisson variance. The maximum mRNA count is around 60 mRNA, suggesting
that ~10 integer-multiple Poisson distributions will capture all of the variability. But this
introduces too many variables: most of the Poisson distributions would
contribute little to the total expression. Therefore we compare the goodness-of-fit for
between 1 and 10 Poisson distributions and choose 3 Poisson distributions as a
balance of capturing the shape (and variance) of the distribution with fewest parameters.
Table 4-1 shows the contributions of the multiple Poisson distributions and the least-square
error. Thus we parameterize transcription from the active P7xtetO locus with a rate of 7 mRNA
(20% probability), 14 mRNA (50% probability) or 21 mRNA (30% probability) per nuclear
mRNA lifetime. As discussed above with the choice of the form of the model, these are
likely not independent integer-multiple states, but serve as a representation of the spread of
activity states. An additional piece of evidence (data not shown) is that the correlation
between mRNA level and cell size is no greater for active cells with P7xtetO than P1xtetO
expression, indicating that the P7xtetO activity state is not dictated by size-correlated global
cell activity. The G1-arrest nuclear expression distributions are overlaid with the model in
Figure 4-8A.
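The parameterization above can be written down as a weighted sum of Poisson distributions. The sketch below evaluates such a mixture for the single-rate case and for the three-rate case, using the rates and probabilities quoted in the text. The function names and the truncation at 100 mRNA are choices made for this illustration only, and no fitting to data is attempted.

```python
from math import exp, lgamma, log

def poisson_pmf(k, rate):
    """Probability of k mRNA for a Poisson distribution with the given mean."""
    return exp(k * log(rate) - rate - lgamma(k + 1))

def mixture_pmf(k, rates, weights):
    """Weighted sum of Poisson distributions, one per transcription rate."""
    return sum(w * poisson_pmf(k, r) for r, w in zip(rates, weights))

def mean_and_fano(rates, weights, k_max=100):
    """Mean and Fano factor (variance / mean) from the truncated pmf."""
    pmf = [mixture_pmf(k, rates, weights) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, var / mean

base_rate = 7  # mRNA per nuclear mRNA lifetime, as quoted in the text

# Single active state: a plain Poisson distribution.
single_state = mean_and_fano([base_rate], [1.0])

# Three active states at 1x, 2x and 3x the base rate.
three_state = mean_and_fano(
    [1 * base_rate, 2 * base_rate, 3 * base_rate],
    [0.2, 0.5, 0.3],
)
```

With these inputs the single-state case has a Fano factor of 1, while the three-state mixture has a mean of 14.7 mRNA and a Fano factor of about 2.6, showing how a spread of rates across loci produces super-Poisson variance even though each state is itself Poisson.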
Table 4-1: Model selection for active expression distributions*
Model | 1xtetO: parameters of best fit | 1xtetO: sum sq. err. | 7xtetO: parameters of best fit | 7xtetO: sum sq. err.
1 ON State | λ = 7 mRNA | 0.5e-4 | λ = 7 mRNA | 1.8e-4
2 ON States | λ = 7 mRNA; State 2 = 2-fold, 35% | 0.5e-4 | State 2 = 2-fold, 60% | 0.4e-4
3 ON States | - | - | State 2 = 2-fold, 50%; State 3 = 3-fold, 30% | [illegible]
4 ON States | - | - | State 2 = 2-fold, 40%; State 3 = 3-fold, 15%; State 4 = 4-fold, 15% | 0.1e-4
* Fit is for nuclear mRNA in active arrested cells.
Next, we aim to parameterize the other metric of active transcription: the likelihood of
activation as a function of cell-cycle stage and activator level for P1xtetO and P7xtetO. To do
this, we first need to determine whether active cells in G2/M arrest have 1 or 2 active loci.
We can solve this with the parameterized active-locus expression distribution obtained from the
G1 arrest expression distributions (above). The G2/M expression distributions comprise
cells with 1 or 2 active loci, each transcribed according to the dynamics of G1 arrest. (This
is actually an assumption that proves true when fitting the G2/M distributions.) Thus the
G2/M arrest expression distribution at each activator level is fit by a weighted combination
of the expression distributions from a single active locus and two-fold that rate. This is shown
for four activator levels in Figure 4-8B. The weighting that gives the best fit is taken
to be the fraction of cells with 1 or 2 active loci.
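The weighting step can be sketched for the simplest case, where one active locus gives a Poisson distribution with mean 7 and two active loci give a Poisson distribution with mean 14. Several things here are assumptions of the sketch rather than the method used: the least-squares criterion, the grid search, the function names, and the "observed" distribution, which is synthetic and built with a known weighting so that the fit can be checked.

```python
from math import exp, lgamma, log

def poisson_pmf(k, rate):
    """Probability of k mRNA for a Poisson distribution with the given mean."""
    return exp(k * log(rate) - rate - lgamma(k + 1))

def fit_one_locus_fraction(observed, one_locus, two_loci, steps=1000):
    """Grid-search the weight f in f*one_locus + (1-f)*two_loci by least squares."""
    best_f, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        f = i / steps
        err = sum(
            (obs - (f * p1 + (1 - f) * p2)) ** 2
            for obs, p1, p2 in zip(observed, one_locus, two_loci)
        )
        if err < best_err:
            best_f, best_err = f, err
    return best_f, best_err

counts = range(61)
one_locus = [poisson_pmf(k, 7) for k in counts]    # single active locus
two_loci = [poisson_pmf(k, 14) for k in counts]    # two-fold that rate

# Synthetic "observed" distribution with a known 30% one-locus fraction.
true_fraction = 0.3
observed = [true_fraction * a + (1 - true_fraction) * b
            for a, b in zip(one_locus, two_loci)]

fitted_fraction, fit_error = fit_one_locus_fraction(observed, one_locus, two_loci)
```

The fit recovers the 30% weighting used to build the synthetic distribution. With real data the residual error would be nonzero, and its size would indicate how well the two-component assumption holds.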
Now we have the probability (or fraction) of a G2/M-arrested cell having 0, 1 or 2 active
loci, for each activator level, for P1xtetO and P7xtetO. Our first observation is that P1xtetO and
P7xtetO activation likelihoods are the same when the "effective" activator level is doubled for
P7xtetO, to scale by mean expression, just as it was for the activator level threshold for the
presence/absence of G1 activity. The 2-fold increase in "effective" activator level likely results
from multiple binding opportunities at the sites closest to the promoter, but we don't
consider its origin here. So we combine P1xtetO and P7xtetO data on this adjusted activation
scale. This gives a curve for the fraction of cells with 0 (light grey), 1 (medium grey) or 2
(black) active loci (Figure 4-8C). These were parameterized by a binding curve for two loci
with an additional parameter to capture any correlation between the two loci's activation:
4-2:
$$P_{G2,1Loci} = \frac{2(A'/K)}{1 + 2(A'/K) + w(A'/K)^2}$$
$$P_{G2,2Loci} = \frac{w(A'/K)^2}{1 + 2(A'/K) + w(A'/K)^2}$$
$$P_{G2,AnyLoci} = \frac{P_{G2,1Loci}}{2} + P_{G2,2Loci}$$
where $A' = A + A_{basal}$, best fit by w = 25, K = 1.9
These fits are shown in Figure 4-8C (grey, black; and red background). The value of w
greater than 1 indicates positive correlation between activation of mother and daughter copies of the
gene. These probabilities of 1 or 2 active loci are combined to give the likelihood that any
given locus is active (Figure 4-8C, white; Table 4-2). These 1-, 2- and Any-loci activation
probabilities quantify the Pearson correlation coefficient for the correlation between mother
and daughter gene copies:
4-3:
$$P_{OFF} = (1 - P)^2 + aP(1 - P)$$
$$P_{1ON} = 2P(1 - P)(1 - a)$$
$$P_{2ON} = P^2 + aP(1 - P)$$
where $P = P_{G2,AnyLoci}$
Both loci are active more often than expected, indicating extrinsic correlation in activation.
On average, the Pearson correlation coefficient a = 0.5, which matches the correlation
between activity at two homologous loci of a diploid observed in the previous chapter (Zopf
et al., 2013). The observed basal likelihood of activation in the absence of any activator is
written directly into the binding curve, but has a small effect on greater-than-basal
expression.
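The relationship between the two-locus binding curve and the correlation coefficient can be sketched numerically. The sketch assumes specific forms for equations 4-2 and 4-3, which are encoded in the functions below: a binding curve with weights 1, 2x and w*x^2 for 0, 1 and 2 active loci (x = A'/K), and a per-locus probability P with Pearson correlation a between the two copies. The function names and the example activator level are hypothetical.

```python
def g2_locus_fractions(a_eff, w=25.0, k=1.9):
    """Assumed form of equation 4-2: fractions of cells with 1 or 2 active loci.

    a_eff is the effective activator level A' on the adjusted scale (fraction).
    """
    x = a_eff / k
    z = 1 + 2 * x + w * x ** 2   # normalizing sum over 0, 1 and 2 active loci
    return 2 * x / z, w * x ** 2 / z

def per_locus_probability(p_one, p_two):
    """Probability that any given locus is active."""
    return p_one / 2 + p_two

def pearson_correlation(p_one, p_two):
    """Solve the assumed equation 4-3 for a, given the 1- and 2-locus fractions."""
    p = per_locus_probability(p_one, p_two)
    return (p_two - p ** 2) / (p * (1 - p))

def fractions_from_correlation(p, a):
    """Assumed form of equation 4-3: fractions with 0, 1 or 2 active loci."""
    p_off = (1 - p) ** 2 + a * p * (1 - p)
    p_one = 2 * p * (1 - p) * (1 - a)
    p_two = p ** 2 + a * p * (1 - p)
    return p_off, p_one, p_two

# Example at a hypothetical effective activator level of 0.95 (95%).
p_one, p_two = g2_locus_fractions(0.95)
p = per_locus_probability(p_one, p_two)
a = pearson_correlation(p_one, p_two)
p_off_check, p_one_check, p_two_check = fractions_from_correlation(p, a)
```

At this example level the sketch gives a correlation of about 0.59, and feeding P and a back through the second set of formulas returns the same 0-, 1- and 2-locus fractions, so the two descriptions are consistent with each other under the assumed forms.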
The case for G1 activity likelihood is simpler: there is only one locus, so P(Locus ON) is
directly observable, and plotted in Figure 4-8D, for P1xtetO and P7xtetO together. This is also
fit to a simple binding curve, including a specified minimum threshold for activation (Table
4-2):
4-4:
$$P_{G1} = \frac{A'/K}{1 + A'/K}$$
where $A' = A - A_{thres}$, best fit by K = 0.48
But our understanding of transcription across the cell-cycle is that S/G2 provides the
window for activation whereas M/G1 is a window of heightened likelihood of turning OFF.
Thus the probability of staying ON characterizes the process better than the probability of
being ON at all. This is shown in Figure 4-8E. Interestingly, the curve is sharp and almost
linear. The threshold for activity in G1 appears to just be the activator level below which all cells
have turned OFF by G1 (rather than an abrupt change in activator behavior). This also
suggests that activator levels control the stability of the active complex. (Activator levels
appear to control the stability of the active complex in long-term G2/M arrest too, but with
lower inactivation rates than G1 (Figure 4-2).) Thus either the M/G1 transition is a window
of increased likelihood of inactivation or the G1 environment is more conducive to
inactivation than G2/M. Given that the M/G1 transition involves extensive chromatin
remodeling, we favor the former.
[Figure 4-8 image: panels A and B plot nuclear mRNA count distributions for 1xtetO and 7xtetO (data and model); panels C and D plot fractions of cells against activation on the adjusted scale (0-200%) for 1xtetO and 7xtetO; panel E plots G1 ON and OFF fractions against P(ON) in S/G2. Legend: G2, any loci ON; G2/M, 1 locus ON; G2/M, 2 loci ON; G1, OFF; G1, ON.]
Figure 4-8: Development of a model of tetO transcription dynamics. (A) The active-locus nuclear
mRNA expression distribution is fit to expression in active G1-arrested cells. The boxplot is a visual
aid of the data (blue) to model (black) fit. (B) Active G2/M-arrested cells' expression (red) is a
combination of the active-locus distribution and the two-active-loci distribution (thin black lines),
weighted by the fraction of active cells with 1 or 2 active loci and summed (thick black line). Each
activator level's expression distribution is fit to find the best-fit fraction of cells with 1 or 2 active
loci. The boxplot is a visual aid of the data (red) to model (black) fit. (C) The model fit from (B)
gives the fraction of cells with 0, 1 or 2 loci ON as a function of activator level for P1xtetO and P7xtetO.
This gives the fraction of all loci ON (white). (D) The fraction of cells ON in G1 is observed directly
from experimental results, (E) but the probability of staying ON after G2/M may better reflect the
situation.
Table 4-2: Summary of the model of arrested tetO transcription dynamics
Quantity | G1 | G2/M
Probability of activity | $P_{G1} = \frac{A'/K}{1 + A'/K}$ | $P_{G2,1Loci} = \frac{2(A'/K)}{1 + 2(A'/K) + w(A'/K)^2}$; $P_{G2,2Loci} = \frac{w(A'/K)^2}{1 + 2(A'/K) + w(A'/K)^2}$; $P_{G2,AnyLoci} = \frac{P_{G2,1Loci}}{2} + P_{G2,2Loci}$
Parameters | K = 48% (or 24% for P7xtetO units); $A' = A - A_{thres}$, $A_{thres}$ = 0.4 (or 0.2 for P7xtetO units) | w = 25; K = 190% (or 95% for P7xtetO units); $A' = A + A_{basal}$, $A_{basal}$ = .0[illegible]
Active-locus transcription (both stages):
P1xtetO: μ = 7 mRNA
P7xtetO: μ = 1 * 7 (P = .2), 2 * 7 (P = .5), 3 * 7 (P = .3)
Nuclear mRNA: $\frac{dN}{dt} = \mu - \delta_N N$, with $\delta_N = k_{Export}$ = 1/(10 minutes)
Cyto. mRNA: $\frac{dM}{dt} = 2\mu - \delta_M M$, with $\delta_M = k_{Degradation}$ = 1/(20 minutes)
The final assumption to complete the full model of transcriptional dynamics is that there
is no transcription from the inactive state. The expression observed in OFF cells does suggest
some low-level basal activity, that may even be activator-dependent, but its contribution to
overall expression is low and so we choose to ignore it. This model of tetO transcription
dynamics is summarized in Table 4-2.
The steady-state model under-predicts variability in cytoplasmic mRNA expression,
which is higher than variability in nuclear mRNA. This indicates that mRNA processing
and export introduce noise, most clearly evidenced by the Fano factor of nuclear versus
cytoplasmic mRNA expression in active cells. P1xtetO active-cell mRNA expression
distributions have a Fano factor of 1-2 for nuclear mRNA and 3-4 for cytoplasmic mRNA.
(For P7xtetO, the Fano factor is 4-5 for nuclear and 6-8 for cytoplasmic mRNA.) Interestingly,
this increase in noise from nuclear to cytoplasmic mRNA comes with increased correlation
between cell size and mRNA count (Pearson correlation coefficient, ρ ≈ 0 for nuclear mRNA
and ρ ≈ 25% for cytoplasmic mRNA). Thus extrinsic variability in global cell activity may
affect mRNA processing and export, explaining the additional noise in expression. Another
limitation of the model is that it does not attempt to capture outlying, extremely intense (>80
mRNA) nuclear spots, which we observed in about 5% of P7xtetO cells in S/G2/M and G2/M
arrest. Ignoring these cases is reasonable because they do not carry over to cytoplasmic
mRNA. They may represent an overabundance of aborted mRNA transcripts (Rondon et
al., 2009) or a bottleneck of mRNA processing or export.
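The two statistics used in this comparison, the Fano factor and the Pearson correlation with cell size, can be sketched as follows. This is a minimal illustration on synthetic counts, not the analysis code used for the data: the function names, the simulated cell-size distribution and the rates (7 and 14 mRNA) are assumptions chosen only to show that a size-dependent rate raises both the Fano factor and the correlation.

```python
import numpy as np

def fano_factor(counts):
    """Variance divided by mean of per-cell mRNA counts (close to 1 for Poisson counts)."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    if mean <= 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return float(counts.var(ddof=1) / mean)

def pearson_r(x, y):
    """Pearson correlation coefficient between two per-cell measurements."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic example (all values hypothetical).
rng = np.random.default_rng(0)
n_cells = 5000
cell_size = np.clip(rng.normal(1.0, 0.2, n_cells), 0.2, None)  # arbitrary units

# Nuclear counts drawn independently of cell size: Poisson-like, uncorrelated.
nuclear = rng.poisson(7.0, n_cells)

# Cytoplasmic counts with a rate proportional to cell size: the extrinsic
# variation in the rate adds variance beyond Poisson and couples count to size.
cytoplasmic = rng.poisson(14.0 * cell_size)

print("Fano (nuclear, cytoplasmic):", fano_factor(nuclear), fano_factor(cytoplasmic))
print("Pearson r with size (nuclear, cytoplasmic):",
      pearson_r(cell_size, nuclear), pearson_r(cell_size, cytoplasmic))
```

In this toy setting the extra cytoplasmic variance comes entirely from the size-dependent rate, which is one way the extrinsic explanation in the text could arise.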
4.3.5 Reproducing cell-cycle kinetics with a model of S/G2/M and G1 stationary transcription dynamics
We use this model of transcriptional dynamics within G1 and S/G2/M to compare
arrested expression to growing populations transitioning between cell-cycle stages. The two
key differences in growing populations which will be captured by introducing cell-cycle
kinetics into the model are that mRNA expression contains the history of previous cell-cycle
stages and cytoplasmic mRNA is delayed by mRNA processing and export. Cells switch
between the simple Poisson dynamics and OFF states according to the likelihood of activity
as they cycle between S/G2/M and G1. Four assumptions that we make are: (1) We model
activity state across the G1/S transition as memoryless, i.e. all cells have equivalent
likelihood of activation in S, regardless of G1 activity. (2) Activation occurs probabilistically
at a single time-point at the beginning of S/G2. (3) Steady-state nuclear mRNA expression
is established immediately, whereas cytoplasmic mRNA is produced by export of nuclear
mRNA. (4) Cells retain a 50% binomial sample of their mRNA during mitosis and may turn
OFF into G1, but not ON from an OFF state. Assumptions (2) and (3) are then evaluated by
comparing the model prediction to growing data. (Summarized in the diagram of Figure 4-9.)
The model is limited by uncertainty in mRNA degradation. While
degradation at short timescales follows simple first-order degradation with a half-life of 20
minutes, there is a long-tail of degradation better described by half-lives around 40 minutes
(Appendix 6.3.1). We implement the model numerically with a finite Markov chain model.
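The ingredients of such a finite Markov chain can be sketched as below. This is a simplified, single-species illustration under stated assumptions, not the thesis implementation: it tracks one mRNA pool rather than separate nuclear and cytoplasmic pools, every cell is ON in S/G2/M and OFF in G1, and the stage durations (60 and 30 minutes), the truncation `N_MAX`, the time step `DT` and all function names are hypothetical. The degradation rate of 1/(20 minutes) and the active-state level of 7 mRNA follow the values quoted in the text.

```python
import math
import numpy as np

def step_matrix(production, degradation, n_max, dt):
    """Column-stochastic one-step matrix for a birth-death chain truncated at n_max."""
    n = np.arange(n_max + 1)
    birth = np.full(n_max + 1, production * dt)
    birth[-1] = 0.0                      # no births out of the top (truncated) state
    death = degradation * n * dt
    stay = 1.0 - birth - death
    if stay.min() < 0.0:
        raise ValueError("dt is too large for this n_max")
    T = np.diag(stay)
    T[n[1:], n[:-1]] = birth[:-1]        # transitions n -> n + 1
    T[n[:-1], n[1:]] = death[1:]         # transitions n -> n - 1
    return T

def partition_matrix(n_max, keep=0.5):
    """Binomial sampling at mitosis: B[m, n] = P(retain m | n before division)."""
    B = np.zeros((n_max + 1, n_max + 1))
    for n in range(n_max + 1):
        for m in range(n + 1):
            B[m, n] = math.comb(n, m) * keep**m * (1.0 - keep)**(n - m)
    return B

def evolve(p, T, minutes, dt):
    """Propagate the count distribution p for the given duration."""
    for _ in range(int(round(minutes / dt))):
        p = T @ p
    return p

def mean_count(p):
    return float(np.arange(p.size) @ p)

# Hypothetical illustration values.
N_MAX, DT = 60, 0.05                     # truncation and time step (minutes)
DEG = 1.0 / 20.0                         # degradation rate, per minute
PROD_ON = 7.0 * DEG                      # gives a steady-state mean of 7 mRNA when ON

T_on = step_matrix(PROD_ON, DEG, N_MAX, DT)
T_off = step_matrix(0.0, DEG, N_MAX, DT)
B = partition_matrix(N_MAX)

p = np.zeros(N_MAX + 1)
p[0] = 1.0                               # start with no mRNA
for _ in range(4):                       # a few cycles, until roughly stationary
    p = evolve(p, T_on, 60.0, DT)        # S/G2/M, locus ON
    p_end_sg2m = p
    p = B @ p                            # mitosis: binomial halving
    p = evolve(p, T_off, 30.0, DT)       # G1, locus OFF
    p_end_g1 = p

print("mean at end of S/G2/M:", mean_count(p_end_sg2m))
print("mean at end of G1:", mean_count(p_end_g1))
```

The G1 distribution here consists only of mRNA carried over from the previous S/G2/M, which is the "history" effect the model is meant to capture.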
[Figure 4-9 diagram.
S/G2/M. Active fraction: 1 or 2 loci are active, according to loci activation likelihood and
mother/daughter correlation. Nuclear mRNA: immediately at steady-state distribution from Poisson
production and degradation rate. Cytoplasmic mRNA: evolves by Poisson production and degradation
(finite Markov method).
Mitosis. Nuclear and cytoplasmic mRNA is halved by binomial sampling.
G1. Active fraction: active cells stay ON with G1/G2 activity likelihood. Nuclear mRNA: remains at
steady-state. Cytoplasmic mRNA: continues to evolve with new activity state.
Simulate 3-4 cell cycles, until stationary.]
Figure 4-9: Diagram summarizing the model of cell-cycle dependent tetO transcription.
Figure 4-10 shows the results for each promoter at four activation levels, for nuclear and
cytoplasmic mRNA. Expression distributions are simplified to quartiles. The model captures
two qualitative aspects of growing populations very well, with surprisingly good quantitative
agreement (Figure 4-10). First, it indicates that cells growing in G1 have the same activity
as steady-state G1 arrest, with extra expression levels simply carried over from previous
S/G2/M activity (Figure 4-10, blue). This suggests cells at low and intermediate expression
turn OFF to G1 levels at or soon after the M/G1 transition. The range of uncertainty in
mRNA degradation leaves open the possibility that cells turn OFF throughout G1, but if so
it has little effect on expression. Second, the model captures kinetics of nuclear and
cytoplasmic mRNA within S/G2/M remarkably well (Figure 4-10, yellow-to-orange).
Nuclear mRNA is approximately constant during S/G2/M, supporting the model
assumption that it reaches steady-state expression immediately after activation. On the other
hand, cytoplasmic mRNA usually rises from early, to mid, to late S/G2/M, with good
agreement to the model, which assumed "production" at the rate of nuclear mRNA export
into the pool of mRNA left over from G1. This supports the idea that activation occurs at
the beginning of S/G2, rather than the alternative that cytoplasmic mRNA increases
because activation occurs throughout S/G2/M.
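The gradual rise of cytoplasmic mRNA after a single activation event is what first-order kinetics predict for the population mean. A minimal sketch, assuming a constant supply rate from an already steady nuclear pool and first-order degradation; the function name, the starting level of 1 mRNA carried over from G1 and the chosen time points are hypothetical, while the rate of 1/(20 minutes) and the level of 7 mRNA follow values quoted in the text:

```python
import math

def mean_cytoplasmic_mrna(t_min, m0, supply_rate, degradation_rate):
    """Mean count at time t for dM/dt = supply_rate - degradation_rate * M, with M(0) = m0."""
    m_ss = supply_rate / degradation_rate
    return m_ss + (m0 - m_ss) * math.exp(-degradation_rate * t_min)

# Hypothetical illustration: activation at t = 0 with 1 mRNA left over from G1.
degradation = 1.0 / 20.0          # per minute
supply = 7.0 * degradation        # steady-state mean of 7 mRNA
for label, t in [("early", 10.0), ("mid", 30.0), ("late", 60.0)]:
    print(label, "S/G2/M:", mean_cytoplasmic_mrna(t, 1.0, supply, degradation))
```

Under these assumptions the mean increases monotonically towards the steady-state level even though activation happened only once, at t = 0.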
[Figure 4-10 panels: (A) 1xtetO nuclear mRNA, (B) 1xtetO cytoplasmic mRNA, (C) 7xtetO nuclear mRNA,
(D) 7xtetO cytoplasmic mRNA, each at 1%, 13%, 50% and 95% activation. x-axis: cell cycle (G1, S, G2, M).
Legend: model median with 25th and 75th percentiles; data median with 25th and 75th percentiles.]
Figure 4-10: The gene regulatory function predicts temporal expression in growing populations.
Growing data (colored) versus model (black/grey) prediction for (A) P1xtetO nuclear mRNA expression,
(B) P1xtetO cytoplasmic mRNA expression, (C) P7xtetO nuclear mRNA expression, (D) P7xtetO
cytoplasmic mRNA expression, each at four activator levels (1%, 13%, 50%, 95%). Quartiles (25%/
50%/75%) of experimental data are shown as colored boxplots. The model median is a black line
and model quartiles are a grey shaded area. Nuclear mRNA reaches "steady-state" early in S/G2 and
is fairly stable across S/G2/M (yellow-to-orange, A, C). Cytoplasmic mRNA increase across S/G2/M
is captured by the model (yellow-to-orange, B, D). Model prediction of G1 expression agrees with data
within the uncertainty of mRNA degradation (blue).
4.3.6 Cell-cycle dependent transcription at the yeast PHO5 gene suggests generality among regulated genes in yeast
Our observations of cell-cycle driven fluctuations in transcription at the tetO promoter
lead us to ask whether this is the case at any other yeast genes. We choose to investigate
expression from the native yeast PHO5 gene, which has been actively studied for its noise
properties (Mao et al., 2010; Raser & O'Shea, 2004). Like tetO, expression from the PHO5
promoter (PPHO5) is titrated by changing the level of its activator, Pho4p. We observe the
fraction of active cells and the cytoplasmic mRNA count of a growing population segregated
into cell-cycle stages. We infer underlying G1 transcription rates using the mRNA
degradation rate (Equation 4-5), supported by the agreement of arrested and growing tetO
expression data. Remarkably, PPHO5 transcription shares the same cell-cycle dependence as
tetO: basal expression is restricted to S/G2/M, and G1 expression increases towards S/G2/M
levels as activator level increases (Figure 4-11). Unlike tetO, there is less separation between
cytoplasmic mRNA levels in cells with and without nuclear mRNA (on average, OFF
subpopulation mean expression is only 20% less than ON subpopulation mean expression,
compared with 80% for P7xtetO and P1xtetO). This suggests that switching at the PHO5 gene
may occur on faster timescales than tetO, independent of cell-cycle activation. The PHO5
gene system has been studied extensively, yet this trend has not previously been appreciated.
M_ss = …    (4-5)
[Figure 4-11 panels (A) and (B). x-axis: activator level. Series: growing late G2/M cells,
growing G1 cells, and G1 inferred from growing data.]
Figure 4-11: Cell-cycle dependent transcription from the PHO5 promoter. Expression is measured
in growing cells segregated into G1 (light blue) and late G2/M (orange) cell-cycle stages. Underlying
G1 transcription is inferred and stationary expression is shown (blue). (A) Mean of cytoplasmic
mRNA expression distributions in each cell-cycle condition for a titration of Pho4p activator levels.
Like the tetO promoter, all basal expression occurs in G2. G1 expression occurs above an activation
threshold, then approaches G2/M expression. (B) The ratio of cells active in G2/M versus G1 shows
the same trends.
4.4 Discussion
4.4.1 Naive interpretation of expression distributions with the bursting transcription model
We have shown that cell-cycle and other extrinsic sources of noise dominate expression
variability. It follows that assigning all noise to stochastic transcription must overestimate
the noise, or "burstiness", of transcription at the gene loci.
Figure 4-12A shows expression of a growing population, which is well-described by the
negative binomial distribution solution to the stationary "bursting" model. According to
this model solution, the parameters of the negative binomial fit are the Frequency and
1/(1+Size) of bursts respectively. We fit and extract these parameters for each condition
(Figure 4-12A is one example) and plot the parametric curve of burst size versus burst
frequency in Figure 4-12B. All three promoters (1xtetO, 7xtetO, PHO5) show a biphasic
trend in the "burst size", which apparently increases and then plateaus as activator level
increases. The apparent burst size at P7xtetO is about 2-fold higher than at P1xtetO. (Note: This
negative binomial fit is to mRNA count per cell, where some cells are in S/G2/M with two
loci.)
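The mapping from negative binomial parameters to apparent burst frequency and burst size can be sketched with a method-of-moments estimate. This is an illustrative stand-in, and the fitting procedure actually used for Figure 4-12 is not specified here; the function name and the synthetic test values (frequency 2, size 10) are assumptions. With shape r = Frequency and p = 1/(1+Size), the mean is Frequency × Size and the Fano factor is 1 + Size.

```python
import numpy as np

def apparent_burst_parameters(counts):
    """Method-of-moments negative binomial fit to per-cell mRNA counts.

    Returns (burst_frequency, burst_size), where the negative binomial has
    shape r = burst_frequency and success probability p = 1 / (1 + burst_size).
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    if mean <= 0:
        raise ValueError("mean count must be positive")
    fano = counts.var(ddof=1) / mean
    if fano <= 1.0:
        raise ValueError("counts are not over-dispersed; burst size would be <= 0")
    burst_size = fano - 1.0              # Fano factor = 1 + burst size
    burst_frequency = mean / burst_size  # mean = frequency * size
    return float(burst_frequency), float(burst_size)

# Synthetic check with hypothetical parameters.
rng = np.random.default_rng(1)
true_frequency, true_size = 2.0, 10.0
sample = rng.negative_binomial(true_frequency, 1.0 / (1.0 + true_size), size=200_000)
print(apparent_burst_parameters(sample))
```

As the surrounding discussion stresses, a good fit of this form does not by itself show that the underlying process is bursty; the two numbers are only "apparent" parameters.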
[Figure 4-12 panels: (A) 1xtetO at 5% activation with negative binomial fit; (B) burst size versus
burst frequency; (C) and (D) plotted against activation (0% to 200%). Series labels: 1xtetO, PHO5.]
Figure 4-12: An explanation of the dynamics inferred by the transcriptional bursting model. (A)
Growing (and arrested, not shown) distributions are well-fit by the negative binomial distribution
solution to the stationary bursting transcription model. (B) For each of the three genes studied here,
the model interprets a biphasic mode of regulation, where activator increases expression first via burst
size and then burst frequency (trend lines are added only to aid the eye). (C) The model predicts
burst frequency increases linearly with activator level. (P7xtetO is plotted against 2-fold activator level,
such that P1xtetO and P7xtetO reduce to the same activation curve.) (D) The model predicts burst size
increases and then plateaus. This trend is mimicked by the difference between G1 and S/G2/M
activity, which peaks at the same point as apparent "burst size" (light grey dotted line). But arrested
populations share this trend. The trend is actually caused by the variation between cells that activate
or not in S/G2, which peaks when 50% of cells are OFF or ON (dark grey dotted line).
But expression clearly does not follow stationary transcriptional bursting, because
expression is markedly different between cell-cycle stages. The burst size inferred here, of
about 10 mRNA for P1xtetO and about 20 mRNA for P7xtetO, represents some combination of
the cytoplasmic mRNA Fano factors of 3-4 and 5-7 observed for P1xtetO and P7xtetO active
loci, plus superposition of inactive and active cells in G1 and G2.
But the biphasic trend in the apparent "burst size" is interesting, and we consider its
underlying cause (Figure 4-12D). One possibility is that at absolute basal levels, active
transcription does not follow the dynamics of higher active levels. Our data cannot determine
this because the active events are so rare and thus poorly sampled by the mRNA FISH
technique. In theory, a high throughput method for counting single mRNA molecules could
determine if basal activity shares the expression distribution of higher levels. We
hypothesized that apparent "burst size", or growing population Fano factor, peaks at the
biggest difference between the fraction of active cells in G1 and S/G2/M (Figure 4-12D, light-grey).
This combination of mostly-ON S/G2/M cells and mostly-OFF G1 cells might create
the highest variability. This correlates well, but proves not to be the cause, partly because
even if all cells are OFF in G1, they carry mRNA from the previous G2/M for much of G1.
Instead, the biphasic trend in burst size appears to track the variability in S/G2 activation
likelihood, such that the Fano factor (normalized variance) peaks when there is a 50% chance
of a cell being OFF or ON (Figure 4-12D, dark-grey; activity likelihoods from Figure 4-8D).
Beyond this specific connection, the fact that the expression distributions fit the negative
binomial solution, despite not adhering to the bursting model, is a cautionary example for
using models to explain data.
4.4.2 Reconsidering cis and trans modes of regulating transcriptional dynamics in yeast
Our findings here give grounds to reconsider previous understanding of how cis and trans
elements modulate expression noise through transcriptional bursting. Results of studies
varying cis and trans elements and inferring how they modulate dynamics were listed in
Table 1-1.
Multiple activator binding sites are expected to increase burst size (Raser & O'Shea,
2004; To & Maheshri, 2010). In this study, adding multiple binding sites, as P7xtetO vs P1xtetO,
affected expression from an active promoter, apparently supporting multiple higher activity
states, long-lived compared to mRNA lifetime. In this way, multiple binding sites increase
expression noise for a given mean (which appears as increased "burst size") through the
variability of expression at the promoter. The multiple binding sites also affect the mapping
of activation likelihood to activator level, which may appear as modulation of the "burst
frequency" at a given activator level. But when scaled by mean expression, we saw the
relationship reduce to a single binding response curve, effectively "decoupling" the two effects
of binding site number. Binding site strength is also expected to affect burst size. It may
also modulate the likelihood of activation in S/G2, via the same mechanism as multiple
binding sites. But binding site strength likely also regulates the stability of the ON state,
modulating its duration in a fairly close approximation of "burst size regulation".
TATA strength is known to modulate expression noise alongside mean, also appearing
as burst size regulation (Mogno et al., 2010; Raj et al., 2006; Raser & O'Shea, 2004). The
CYC1 promoter within the tetO promoter used in this study has one strong and three weak
TATA binding sites. But we observed Poisson-like expression at the active P1xtetO promoter.
So how could ablation of the TATA further reduce noise? Our hypothesis is informed by
T.L. To's previous measurement in our lab that a growing population expressing from P1xtetO
with an ablated TATA site has an apparent burst size of 1-2 and a frequency changing with
activator level. This low Fano factor may derive from activation with similar likelihoods but
a much lowered productivity in the active state (i.e. ~1 mRNA produced per mRNA lifetime
instead of 7 mRNA measured here). This would mean that ON cells are not highly
differentiated from OFF cells, keeping the apparent burst size low. The alternative scenario
of cells activating with less likelihood in S/G2, or cells activating but then turning OFF
very soon after, would be expected to result in a growing population with a higher Fano
factor. But the actuality may be a combination of both scenarios.
Chromatin remodeling prior to activation of transcription is also expected to increase
burst size. It appears that once chromatin is removed, the activator can stay primed for
multiple rounds of reinitiation in one "burst" of transcription. In this study, we investigated
cell-cycle as a global regulator of transcription at the PHO5 promoter. The inactive PHO5
gene is occluded by three well-positioned nucleosomes, on the TATA box and a high-affinity
Pho4p binding site; a low-affinity Pho4p binding site is exposed between the
nucleosomes (Vogel, Horz & Hinnen, 1989) (Figure 4-13). Pho4p binds to the exposed
low-affinity binding site, which instigates chromatin remodeling and disassembly
(Svaren & Horz, 1997), which exposes the high-affinity binding site and TATA box and
enables active transcription.
[Figure 4-13 diagram: the PHO5 promoter, showing nucleosomes and activator binding sites.]
Figure 4-13: The PHO5 gene. One strong activator binding site ("H") is occluded by a well-positioned
nucleosome; one weak activator binding site ("L") is exposed. Pho4p binds the exposed site
to induce chromatin remodeling, allowing Pho4p binding of the high-affinity site, then activation of
PHO5 expression.
We saw basal transcription restricted to S/G2 and increasing activation in S/G2/M,
and then in G1, in response to increasing activator levels. But compared to the tetO
promoter, where chromatin remodeling plays a minor role, expression in active and
inactive states was less separated. This suggests that the promoter more quickly
transitioned between a state producing mRNA, resulting in a visible nuclear spot, and
a dormant state. This may indicate multiple physical states with varying activity and
chromatin occupancy. For example, the S/G2 window for activation may play the same
role for PPHO5 as for P1/7xtetO, enabling initial activation. But throughout the cell-cycle, an
activated promoter may fluctuate between states with one or more bound nucleosomes,
creating shorter-term fluctuations in transcription. Others have studied expression noise
at the PHO5 gene in response to a range of Pho4p activator mutants (Mao et al., 2010).
They compared three models of activation and concluded that Pho4p controls the rate
of nucleosome disassembly and then assembly of transcription machinery. These data may
well be confounded by the cell-cycle effects observed here. It may be that Pho4p controls
the likelihood of activation (during nucleosome disassembly of S/G2) and also then
controls the rate of fluctuations between the nucleosome-free states throughout the cell-cycle.
Other genes in the PHO family have different promoter architectures. Promoters
less-occluded by chromatin may mimic the tetO gene in the dominant separation of
ON/OFF states. Promoters occluded by chromatin but with only weak activator binding
sites may have only a short-lived period of activity following S/G2 activation. This could
be easily tested at a family of PHO5 variants with our method and would give interesting
insight into chromatin regulation dynamics.
The role of activator level or function was least clear in Table 1-1. Activator appears to
predominantly regulate burst frequency, but also burst size in some cases. Here, we saw that
activator directly regulated the likelihood of activation in S/G2, which may correspond to
apparent burst frequency, and is in agreement with the basic understanding of chemical
kinetics. But it also appears to regulate the stability of the active complex across the M/G1
transition and during extended periods of G2. This could be tantamount to "burst size"
regulation because it modulates the amount of mRNA production in a single active period.
One unresolved question about our observations at the tetO promoter, and common to
any gene that follows a complicated scheme of activation, is how the overall response of
mean expression to activator level is linear. This seems surprising, given that the S/G2/M
activation curve increases rapidly before plateauing and G1 transcription occurs only after
some activator threshold. It appears that these balance to give an overall linear response
curve. But whether a linear response is inherent to the mechanism of activation or
coincidence remains to be answered.
4.4.3 Predictions about cell-cycle as a global transcriptional regulator in other organisms
Cell-cycle may prove to act as a global transcriptional regulator, to some extent, at all
yeast genes. But key differences between yeast and other organisms create doubt as to
whether this is so in other organisms. Nonetheless, our findings suggest other effects of cell-cycle
or chromatin remodeling in lower and higher organisms.
Bacteria do not share the same chromatin DNA packaging as eukaryotes, but bursting
has still been identified at some genes (Golding et al., 2005; So et al., 2011). This suggests
other factors regulate promoter activity state, which may or may not involve the cell-cycle.
But E. coli cells in fast growing conditions actually contain multiple genome copies, in
preparation for upcoming rounds of division. Thus expression must be the average of
transcription at each copy of a gene. This should be taken into account when inferring
dynamics from expression noise because the apparent, averaged size of transcriptional bursts
would dampen true burstiness or switching dynamics.
Mammalian cell cycles (10-20 hours) are slow compared with yeast cultures, and spend
a relatively small fraction of time in S/G2. Hence S/G2/M activity will make a relatively
small contribution to real-time expression or expression across a population, and most likely
doesn't substantially affect noise. But if the S/G2 window of activation observed here is
indeed linked to post-replication chromatin remodeling, there are implications for
mammalian cells. In this case, cell processes that govern chromatin remodeling events may
underlie all bursting transcription in mammalian cells. Also, cell volume alone is proving to
play a dominant role in global transcription activity at constitutive genes (Raj, 2013, data
not yet published). But what dominates variability in expression of regulated genes under
repressed conditions remains to be seen.
4.5 Conclusions
We found that tetO transcription dynamics are characterized by globally-correlated,
probabilistic S/G2 activation controlled by activator binding to its operator, and active
transcription with surprisingly low noise depending on promoter architecture. Beyond the
results presented here, these findings motivate revisiting studies that ascribe all noise
to stochastic transcription. They suggest a causative link between DNA replication and
transcription under repressed conditions, with implications for biologically relevant cases of
fast-growing cells, such as cancer and development. A gene with cell-cycle dependent
transcription will perform differently in many network topologies, which should be
considered in synthetic biology design. Together, this presents a method for determining
transcriptional dynamics at a single promoter and quantifies the role of cell-cycle as a general
regulator and source of noise in gene expression.
4.6 References
Cai, L., Friedman, N., & Xie, X. S. (2006). Stochastic protein expression in individual cells
at the single molecule level. Nature, 440(7082), 358-362.
Chubb, J. R., Trcek, T., Shenoy, S. M., & Singer, R. H. (2006). Transcriptional pulsing of
a developmental gene. Current Biology, 16(10), 1018-1025.
Friedman, N., Cai, L., & Xie, X. S. (2006). Linking stochastic dynamics to population
distribution: An analytical framework of gene expression. Physical Review Letters,
97(16), 168302.
Golding, I., Paulsson, J., Zawilski, S. M., & Cox, E. C. (2005). Real-time kinetics of gene
activity in individual bacteria. Cell, 123(6), 1025-1036.
Hilfinger, A., & Paulsson, J. (2011). Separating intrinsic from extrinsic fluctuations in
dynamic biological systems. Proceedings of the National Academy of Sciences,
108(29), 12167-12172.
Mao, C., Brown, C. R., Falkovskaia, E., Dong, S., Hrabeta-Robinson, E., Wenger, L., &
Boeger, H. (2010). Quantitative analysis of the transcription control mechanism. Mol
Syst Biol, 6
Mogno, I., Vallania, F., Mitra, R. D., & Cohen, B. A. (2010). TATA is a modular
component of synthetic promoters. Genome Research, 20(10), 1391-1397.
Raj, A., Peskin, C., Tranchina, D., Vargas, D., & Tyagi, S. (2006). Stochastic mRNA
synthesis in mammalian cells. PLoS Biol, 4(10), e309-e309.
Raj, A., & van Oudenaarden, A. (2009). Single-molecule approaches to stochastic gene
expression. Annual Review of Biophysics, 38(1), 255-270.
Raser, J. M., & O'Shea, E. K. (2004). Control of stochasticity in eukaryotic gene
expression. Science, 304(5678), 1811-1814.
Rondon, A. G., Mischo, H. E., Kawauchi, J., & Proudfoot, N. J. (2009). Fail-safe
transcriptional termination for protein-coding genes in S. cerevisiae. Molecular Cell,
36(1), 88-98.
So, L., Ghosh, A., Zong, C., Sepulveda, L. A., Segev, R., & Golding, I. (2011). General
properties of transcriptional time series in Escherichia coli. Nature Genetics, 43(6),
554-560.
Suter, D. M., Molina, N., Gatfield, D., Schneider, K., Schibler, U., & Naef, F. (2011).
Mammalian genes are transcribed with widely different bursting kinetics. Science,
332(6028), 472-474.
Svaren, J., & Hörz, W. (1997). Transcription factors vs nucleosomes: Regulation of the
PHO5 promoter in yeast. Trends in Biochemical Sciences, 22(3), 93-97.
Taniguchi, Y., Choi, P. J., Li, G., Chen, H., Babu, M., Hearn, J., Xie, X. S. (2010).
Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in
single cells. Science, 329(5991), 533-538.
To, T., & Maheshri, N. (2010). Noise can induce bimodality in positive transcriptional
feedback loops without bistability. Science, 327(5969), 1142-1145.
Vogel, K., Horz, W., & Hinnen, A. (1989). The two positively acting regulatory proteins
PHO2 and PHO4 physically interact with PHO5 upstream activation regions.
Molecular and Cellular Biology, 9(5), 2050-2057.
Zopf, C. J., Quinn, K., Zeidman, J., & Maheshri, N. (2013). Cell-cycle dependence of
transcription dominates noise in gene expression. PLoS Comput Biol, 9(7), e1003161.
CHAPTER 5. Future directions
5.1 Synopsis
This thesis characterizes how cell cycle influences the transcriptional dynamics of
promoters not previously associated with cell-cycle dependent control in budding yeast. It
precisely and quantitatively defines to what extent this influence contributes to the noise in
gene expression. While the work has been limited to two promoters, and in budding yeast,
it leads to the exciting possibility that these particular cell-cycle dependent transcriptional
dynamics operate for the 20-30% of genes characterized as "noisy" in S. cerevisiae (Bar-Even
et al., 2006). Additional important questions remain regarding the origin, mechanism,
generality and consequences of cell-cycle dependent transcription. We discuss our hypotheses
pertaining to these questions, and identify existing and new approaches in testing these
hypotheses.
5.2 The origin of the S/G2 window for transcriptional activation
We have observed that low expressing PHO5 and tetO promoters activate only in early
S/G2 (Chapter 3). Given the importance of chromatin remodeling for efficient activation of
yeast promoters, we have hypothesized that chromatin maturation following DNA
replication leaves otherwise chromatin-repressed genes open for activation.
The hypothesis is informed by current understanding of chromatin maturation. During
replication, old histones are recycled to new histones, apparently with random segregation
of groups of histones to leading and lagging strands. (Our mRNA FISH can detect no
evidence for or against bias towards mother or daughter turning on; however, it is also
believed that chromosomes segregate between mother and daughter cells randomly (Keyes
et al., 2012).) This is likely to involve the Mcm2p helicase, which is bound to the replicating
strands and also binds free histones, acting as a transient docking site. But equal numbers
of histones must be newly synthesized and assembled on the nascent DNA. Production of
histones is rapid but tightly regulated. These are then shuttled to the nucleus via sequential
association with histone chaperones, undergoing post-translational modifications, to the
chromatin assembly factor CAF1 that mediates replication-coupled histone deposition
(Alabert & Groth, 2012; Annunziato, 2012). The nascent chromatin is highly acetylated and
thus open and sensitive to nuclease digestion, potentially creating a 'window of opportunity'
for not only DNA repair, but also transcription factor binding and transcriptional activation.
We theorize that this is the underlying cause for replication-linked transcription.
Maturation of nuclease-sensitive nascent chromatin to nuclease-resistant chromatin
similar to bulk chromatin in interphase takes 10 to 20 minutes (Alabert & Groth, 2012).
During this time, it undergoes rapid processing by chromatin modifying and remodeling
agents, often guided by interactions with the replication machinery, into a more compact
state. The PCNA clamp recruits several of these chromatin modifiers and is positioned to
integrate chromatin assembly and maturation with replication and fork repair. PCNA
dynamics have been measured as highly stable, remaining on replicated DNA for up to 20
minutes (Moggs et al., 2000). One hypothesis for our observation that sometimes only one
strand of newly replicated DNA is ON (based on nuclear spots from mRNA FISH) is that
discontinuous DNA synthesis on the lagging strand creates asymmetry in PCNA activity
and thus chromatin maturation. However, the positive correlation in activation of a mother
and daughter chromosome suggests this effect is small, if present at all, and there are similar chances
for activation of the chromosomes made from the leading and lagging strand. At any rate,
chromatin maturation primarily involves histone deacetylation and histone linker H1
binding, and thus may be disrupted or slowed by short-term treatment with histone
deacetylase (HDAC) inhibitors. (This brief summary of chromatin maturation draws from
reviews by Alabert & Groth (2012) and Annunziato (2012).)
Hence we hypothesize that DNA replication and associated chromatin maturation leads
to the window for transcription from otherwise repressed genes in early S/G2. Ideally, we
would test this hypothesis directly on a non-replicating promoter. Comparing expression
from a replicating and non-replicating promoter should establish whether replication is
necessary for repressed transcription. This could involve a non-replicating promoter
expressing a reporter from a centromeric plasmid whose replication origin is flanked by loxP
sequences, which can be excised to remove the replication sequence and create a stable non-replicating
plasmid.
Nonetheless, a more complete picture will come from knowing which factors are involved.
Our hypothesis predicts that removing factors involved in chromatin maturation will reduce
or enhance the cell-cycle dependent transcription. For example, addition of HDAC inhibitors
or knockout of HDAC genes should delay chromatin maturation and extend the window for
activation, perhaps enhancing expression activity from repressed genes. A set of candidate
genes whose deletion might affect cell-cycle dependent transcription is shown in Table 5-1.
mRNA FISH on arrested cells is the most efficient, reliable and direct method to screen
the knockouts for changes in cell-cycle dependent transcription. Each knockout will be
screened under G1 (alpha-factor) and G2/M (nocodazole) arrest at high and low expression
levels. The high expression case serves as a
control for global changes in expression; the low expression case is the test of whether the
knockout gene is involved in the window of opportunity for repressed transcription. A caveat
is that chromatin regulation is so central to eukaryotic cell function that knockouts are likely
to change global expression levels. In this case, the test result would be only a relative and
not absolute change in expression levels.
Table 5-1: A list of gene knockouts for studying the origin of cell-cycle dependent transcription

Function                     Details                                          Gene name
Chromatin remodeling         SWI/SNF subunits, involved in DNA replication    SNF2, RTT102
                             and transcription
Chromatin assembly factor    CAF-1, involved in chromatin dynamics during     CAC2, RLF2, MSI1
                             transcription
Histone chaperone                                                             RTT106, VPS75
Nucleosome spacing factor                                                     INO80
Nucleosome assembly factor                                                    ASF-1
Histone acetyltransferase    SAGA complex units, global regulator             GCN5, HF1, SPT3
                             ADA complex units                                (GCN5,) AHC2
                             HAT complex                                      HIF1, HAT1
                                                                              RTT109
                             NuA4 complex                                     EAF1
                             SAS complex, acetylates free histones            SAS4
Histone deacetylase          Rpd3L/Rpd3S complex units                        PHO2, CTI1, SIN, RPD3, HOS2
Histone methyltransferase    COMPASS units                                    BRE2, SET1, SET2
Cell cycle progression       Promotes G2 to M transition                      CLB5
(Cherry et al., 2012, Saccharomyces Genome Database)
5.3 The complete mechanism of cell-cycle regulation of transcription
Our experiments and current literature support this picture of cell-cycle regulated
activation: A gene locus may turn ON in S/G2, with likelihood increasing with activator level.
Activation is somewhat correlated at the cell level (i.e. between homologous loci of a diploid)
and between mother and daughter loci. The activation state is maintained through G2 and
M, though its stability is limited on longer timescales such that it may turn OFF during
extended G2/M, depending on activator levels. There is a stronger activator-dependent
likelihood of turning OFF over the M/G1 transition, and the new activation state is
maintained through G1. There is no indication for or against memory across the G1/S
transition, which would make S/G2 activation dependent on G1 state.
Our quantitative analysis was enabled by the resolution of single-molecule nuclear and
cytoplasmic mRNA FISH under arrest and (CJ Zopf's) real-time protein tracking of
steady-state expression and activation kinetics. But many aspects remain uncertain and others
were unobservable. Completing this picture is necessary for a full understanding of the
cell-cycle's role in transcriptional activation, and I present suggestions for it here, summarized
in Figure 5-1.
[Figure 5-1 diagram: cell-cycle schematic (G1, S, G2, M) annotated with results and future questions.
Results: activator-dependent S/G2 activation; delay in cytoplasmic mRNA due to export; activator-dependent inactivation in extended G2; activator-dependent inactivation at M/G1.
Future questions: Mother/daughter activation bias? Can activation occur outside S/G2? Is correlation global or at the replicating chromosome? What is the timing of inactivation? Is history of activation carried into S?]
Figure 5-1: Summary of results and open questions about the cell-cycle transcription pattern
Starting with S/G2 activation, is there a preference for activation of the mother gene
copy on the leading strand versus the nascent daughter on the lagging strand? This would
inform us of the mechanism of chromatin maturation following DNA replication. Because
mother/daughter chromosomes are randomly segregated to mother/daughter cells in mitosis,
this will require identifying which strand(s) are ON at the time of replication and activation.
This could involve direct visualization with single-molecule techniques, or ChIP or
fluorescence techniques to visualize key proteins at the replication and transcription site.
The correlation in activation of homologous loci in a diploid cell and between a mother
and daughter chromosome following replication is similar but the source is unknown. Do
both simply arise from global cell transcriptional capacity or activator level? The first test
for this is to hybridize probes to the activator simultaneously with the reporter and look for
correlated levels. Gene-independent, cell-wide expression correlation should then be probed
at other constitutive and regulated genes, in addition to mining already-available genome-wide expression correlation data.
The S/G2 period appears to dominate activation, but can activation occur outside of it?
Could a highly-expressed or moderately-expressed gene activate in late G2/M or G1? This
should be tested in slow growth conditions where cells spend most of their cycle in G1.
Measurements would be taken before and after gene induction to test for G1 activation. A caveat
of any study in slow growth conditions is that these lead to global changes in expression
which can cloud all results. This experiment may require several controls on global
expression or selection of a gene unaffected by growth conditions.
Further probing G1 activity, it is unclear how G2/M activation levels transition to G1.
Data indicates that cells may turn OFF over M/G1, but do not (or very rarely) turn ON.
But when does the inactivation occur and on what does it depend? For a gene promoter
with a range of activity levels when ON, such as P7xtetO, are less active genes more likely to
turn OFF? Further, is there an increase in these genes' activity prior to mitosis in
anticipation of dormant transcription during division, as has been identified at other genes?
These answers require advanced experimental techniques obtaining real-time data at
mRNA-level for the necessary resolution. The MS2 variant of mRNA FISH, where a gene is
engineered to bind multiple fluorescently-tagged MS2 proteins in vivo, may be best, so long
as the bulk of the engineered and tagged mRNA does not change activation dynamics.
Finally, we know little about the G1/S transition. Here too, real-time, precise mRNA
levels, perhaps obtainable with the MS2 method, or real-time visualization of a gene's
activation state, with fluorescently labelled transcription machinery, will answer whether
S/G2 activation carries G1 history.
5.4 Prevalence of cell-cycle driven transcription
While the majority of work in this thesis was focused on synthetic tetO promoters, we
observed similar cell-cycle dependent expression at the native PHO5 gene. A key feature
was that at low expression, there was no transcriptional activity in G1. Moreover, related
work demonstrates that kinetic activation of the PHO5 promoter upon introduction of its
upstream transcription factor is strongly biased to occur in S/G2 (Zopf, Wren, Maheshri,
unpublished). We have suggested that this phenomenon may at least be present at the
highly regulable, noisy, TATA-box containing class of yeast promoters previously identified
(Newman et al., 2006).
This could be tested by screening a set of well-studied regulated yeast promoters. The
most basic experiment will be measuring gene expression at high and low expression under
G2/M and G1 arrest. Ideally this could be done with RNA-seq, to measure all genes'
expression at once. But RNA-seq may not give clear results because expression is normalized
against total RNA levels. Experiments on one gene at a time could first screen at a bulk
population level, rather than single cell, to test for a much higher fold-change in G2:G1
expression at low versus high expression.
We do expect differences in cell-cycle dependent activation that depend on the nature
of the gene's promoter. An immediate example is that expression from PHO5 has much less
separation between cytoplasmic expression in cells with and without nuclear mRNA. This
is consistent with a less stable ON state, or switching between activity states. One can
imagine that a promoter with a complex sequence of initiation, involving removal of multiple
nucleosomes, could have multiple slow-steps for initiation, only some of which may be
dictated by the cell-cycle. The PHO5 system would be an excellent case study, given the
well-known characteristics of the promoter variants. More generally, this approach will
reveal elements of promoter architectures enriched among genes identified as subject to cell-cycle driven transcription, or those that are not. Finally, observing the nature and effects of
cell-cycle regulated activation in several natural genes will allow us to answer whether it has
effects at the phenotypic level in biologically relevant regimes.
5.5 Towards a generalized, predictive model for stochastic transcription dynamics
An eventual goal of biological research is to construct models that enable a priori
prediction of cell- and tissue-level behavior of genes and gene networks. Models such as the
regulatory function developed here for a single gene are a small step to that eventual goal.
A next step is to extend our gene regulatory function beyond its current domain, by
measuring expression under different cis and trans regulatory conditions. This work has
established the requirement for gene regulatory information for transcription with any cell-cycle dependence: mRNA expression data under cell-cycle arrest.
Here we consider interesting directions to extend the domain of the gene regulatory
function. Additional binding sites seem to act like multiple activity states, causing higher
ON-state variability. To what extent does this trend continue with more binding sites and
when does it saturate? The TATA box is another cis element important to transcriptional
dynamics. T.L. To measured expression from P1xtetO with an ablated TATA box in our lab
and saw a reduction in population-wide noise, as have others at other genes (Hornung et al.,
2012; Mogno et al., 2010; Raser & O'Shea, 2004). Seeing that the P1xtetO promoter has
minimal noise when ON, a likely hypothesis is that the TATA box is necessary for stability
to maintain the active state through the cell-cycle stage. This could be tested on several
variants of TATA box strength.
The case of higher expression through activation is particularly interesting. Activation
at 100% in this study refers to maximum induction of tTA when expressed from the ADH1
gene's promoter on a centromeric plasmid. But tTA levels can be increased further using
other promoters or by placing it under positive autoregulation. Our lab has previously
measured that maximum activation gives expression 2-fold higher for P7xtetO and 4-fold
higher for P1xtetO, than the level of activation assigned to 100% in Chapter 4 (To & Maheshri,
2010). But our gene regulatory function allows only 60% and 20% further increase in
expression levels, through saturating the ON state. So where does the remaining expression
increase come from? One possibility is that further increase in tTA levels increases ON state
activity. Yet ON state activity is particularly unaffected by activator level for 1xtetO. Therefore,
what is the tTA level at which ON state activity increases, and does this occur after the
probability of being in the ON state in any cell-cycle stage reaches 100%? Such questions
could be answered by measuring expression distributions in arrested cells
containing higher amounts of tTA, but caution is warranted as too high levels of tTA are
toxic to cells and have global effects on expression.
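The size of the unexplained increase can be estimated with a line of arithmetic. The sketch below assumes that the 60% and 20% increases permitted by the gene regulatory function pair, in that order, with the measured 2-fold and 4-fold increases; the pairing is an assumption made for illustration.

```latex
% Residual fold-change not explained by saturating the ON state,
% assuming the pairing (2-fold, +60%) and (4-fold, +20%).
\[
\frac{2}{1 + 0.60} = 1.25,
\qquad
\frac{4}{1 + 0.20} \approx 3.3
\]
```

Under that assumption, ON-state activity would need to rise by roughly 1.25-fold in one case and more than 3-fold in the other to account for the measured expression.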
Repeating work suggested here and done earlier in the thesis on several gene classes'
promoter variants should form a foundation for generalized models of regulated stochastic
transcription dynamics and progress towards a priori design of gene networks.
5.6 Consequences of cell-cycle driven transcription in gene networks
As described in Chapters 1 & 2, noise in expression has consequences for behavior in
gene networks, and how an activator regulates transcriptional dynamics can affect the
network behavior. Most studies to date have focused on the consequences of variability from
transcription assuming bursting dynamics. This thesis work now leads to a whole new set
of questions for how genes with variable transcriptional activity driven by the cell-cycle
function within gene networks of different network topologies. Our work in determining a
quantitative gene regulatory function for the tetO promoter in Chapter 4 enables the
development and exploration of physically realistic mathematical models in which genes
exhibit stochastic transcription with cell-cycle dependent dynamics of the type studied here.
We describe some general consequences of adopting this new view of stochastic gene
expression. Straightforward computational and theoretical approaches can confirm and
expand upon these ideas.
First, we predict more ordered and correlated fluctuations. At minimum, the fact that
each cell in S/G2/M has two homologous loci contributing to a gene's expression doubles
the apparent burst frequency, dampening fluctuations. On the other hand, we note that
expression between homologous loci, and perhaps all genes, within a cell is correlated. In
this case, fluctuations would not be dampened, but entire network activity may fluctuate
together within a cell. On a single-cell basis over time, these fluctuations may be more
cyclical and ordered because they are dictated by the periodic cell cycle, rather than
independent, stochastic promoter events. However, just as for fluctuations from
transcriptional bursting, network activity occurs on the protein level so we must consider
whether cell-cycle driven fluctuations will be averaged away by delays between mRNA
export, translation, protein relocation to the nucleus and protein lifetime. Any protein-level
fluctuations are more likely for transcription factor proteins with short lifetimes, which is
often true for signaling proteins. Whichever the case, significant fluctuations at the protein
level are more likely with slower cell cycling, which could make network behavior dependent
on cell growth and metabolism. It will be interesting to explore these general effects, through
experiments measuring networked genes' mRNA and protein fluctuations in a range of
growth scenarios, and models to capture dynamics.
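The dampening argument for two loci can be written as a short variance calculation. This is an illustrative sketch, not a result from the thesis: it assumes the two loci have identical mean expression \mu and variance \sigma^2, and introduces a correlation coefficient \rho between them (all three symbols are assumptions added here).

```latex
% Noise (coefficient of variation) of the summed output of two loci X_1 and X_2,
% each with mean \mu and variance \sigma^2, and correlation \rho between them.
\[
\mathrm{Var}(X_1 + X_2) = 2\sigma^2 (1 + \rho),
\qquad
\mathrm{CV}_{\mathrm{total}}
  = \frac{\sqrt{2\sigma^2(1+\rho)}}{2\mu}
  = \frac{\sigma}{\mu}\sqrt{\frac{1+\rho}{2}}
\]
```

With independent loci (\rho = 0) the noise falls by a factor of 1/\sqrt{2}, about 0.71; with fully correlated loci (\rho = 1) there is no dampening at all, which is the distinction drawn in the paragraph above.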
Within a positive feedback loop stochastic fluctuations have previously been predicted
and demonstrated to lead to bimodal expression distributions even without deterministic
bistability (e.g. To & Maheshri, 2010), theoretically using models assuming transcriptional
bursting and experimentally using the exact same tetO promoters studied here. Reevaluating
these results in light of this thesis work, it's first worth considering whether the pattern of
cell-cycle driven transcription changes underlying deterministic stability. Assuming
transcriptional bursting, bimodality was understood to arise from two relatively stable
states: OFF cells with low activator levels resulting in a low burst frequency and ON cells
with high activator levels resulting in a high burst frequency. Transitions could occur from
OFF to ON cells if a rare large burst resulted in enough activator molecules to drive further
expression to the ON state. Transitions could occur from ON to OFF provided activators
present in the ON state could all degrade before the next burst. Hence, if an activator's
lifetime was on a timescale similar to the burst frequency in the ON state, bimodality was
possible.
Noise-stabilized bimodality could arise from cell-cycle driven transitions with a different
pair of stable states: decay of expression during G1 to an OFF state such that activation in
S/G2 is unlikely and cells may remain OFF for one or more cell cycles, and cells that
transition ON in S/G2 that quickly generate high expression causing reactivation in
subsequent cycles. Less clear is how cells transition from the ON to the OFF state, perhaps
through higher-than-average stochastic degradation in G1. But multiple operator sites'
varied expression levels could also result in stochastic fluctuations that permit a gene to
enter the OFF state. Full stochastic models, based on the model developed here for
stationary transcription dynamics at the tetO promoter but also incorporating delay in
relocation of the processed transcription factor to the nucleus, will provide further insights.
Ultimately, experimental results for the behavior of positive feedback in each cell-cycle stage
will explain how bimodal gene expression arises in light of the cell-cycle driven noise we
have measured here.
5.7 Future techniques to observe transcription with single-molecule precision in real-time
Many of the challenges of this work involved inferring transcription from static snapshots
of its mRNA product. Though it is possible to see individual mRNA in real-time (reviewed
in Darzacq et al., 2009), it so far requires engineering the sequence of the
mRNA to bind multiple MS2 bacteriophage coat proteins attached to fluorescent probes.
The method has been successful in several studies and, recently, used to quantify
transcription rates at any gene in a mammalian cell (Yunger et al., 2013). But it requires
engineering of the gene of interest and the bulky probes may affect mRNA dynamics. A new
generation of methods is seeking to view single-molecule transcription, but so far only in
vitro. Revyakin et al. (2012) tethered the sequence of a mammalian gene to a surface and
used advanced optics to visualize basal transcription by transcriptional machinery purified
from cell culture. They were able to directly visualize TF assembly and individual
transcription events. Purification and addition of cofactors and mediator would enable
similar study of activated transcription. The same group also developed an interesting
method of in vitro FISH, where probes bind very quickly because they contain only A, T
and C nucleotides and so fold poorly. Transcription processes can therefore be measured at
sub-second temporal resolution. They used this "fastFISH" method to visualize the fast T7
bacteriophage's rate of promoter escape, elongation and termination of transcription (Zhang
et al., 2014). Given how important single-cell, single-molecule techniques have been to the
study of transcriptional dynamics so far, it seems inevitable that methods visualizing
multiple species of molecules in real-time will pave the way for a new generation of studies
of transcriptional dynamics.
5.8 References
Alabert, C., & Groth, A. (2012). Chromatin replication and epigenome maintenance.
Nature Reviews Molecular Cell Biology, 13(3), 153-167.
Annunziato, A. T. (2012). Assembling chromatin: The long and winding road. Biochimica
Et Biophysica Acta (BBA)-Gene Regulatory Mechanisms, 1819(3), 196-210.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., & Barkai, N.
(2006). Noise in protein expression scales with natural protein abundance. Nature
Genetics, 38(6), 636-643.
Cherry, J. M., Hong, E. L., Amundsen, C., Balakrishnan, R., Binkley, G., Chan, E. T., ...
Wong, E. D. (2012). Saccharomyces Genome Database: The genomics resource of
budding yeast. Nucleic Acids Research, 40, D700-D705.
Darzacq, X., Yao, J., Larson, D. R., Causse, S. Z., Bosanac, L., de Turris, V., ... Singer, R.
H. (2009). Imaging transcription in living cells. Annual Review of Biophysics, 38(1),
173-196.
Hornung, G., Bar-Ziv, R., Rosin, D., Tokuriki, N., Tawfik, D. S., Oren, M., & Barkai, N.
(2012). Noise-mean relationship in mutated promoters. Genome Research, 22(12),
2409-2417.
Keyes, B. E., Sykes, K. D., Remington, C. E., & Burke, D. J. (2012). Sister chromatids
segregate at mitosis without mother-daughter bias in Saccharomyces cerevisiae.
Genetics, 192(4), 1553-1557.
Moggs, J. G., Grandi, P., Quivy, J.-P., Jónsson, Z. O., Hübscher, U., Becker, P. B., &
Almouzni, G. (2000). A CAF-1-PCNA-mediated chromatin assembly pathway
triggered by sensing DNA damage. Molecular and Cellular Biology, 20(4), 1206-1218.
Mogno, I., Vallania, F., Mitra, R. D., & Cohen, B. A. (2010). TATA is a modular
component of synthetic promoters. Genome Research, 20(10), 1391-1397.
Revyakin, A., Zhang, Z., Coleman, R. A., Li, Y., Inouye, C., Lucas, J. K., ... Tjian, R.
(2012). Transcription initiation by human RNA polymerase II visualized at single-molecule
resolution. Genes & Development, 26(15), 1691-1702.
Yunger, S., Rosenfeld, L., Garini, Y., & Shav-Tal, Y. (2013). Quantifying the
transcriptional output of single alleles in single living mammalian cells. Nature Protocols,
8(2), 393-408.
Zhang, Z., Revyakin, A., Grimm, J. B., Lavis, L. D., Tjian, R., & Kadonaga, J. T. (2014).
Single-molecule tracking of the transcription cycle by sub-second RNA detection.
eLife, 3, e01775.
CHAPTER 6. Appendix
6.1 Yeast strains and plasmids
Strain and plasmid construction. All S. cerevisiae strains were constructed in a W303
background (Thomas & Rothstein, 1989) using standard methods of yeast molecular biology
(Guthrie & Fink, 2002).
Table 6-1: Yeast strains used in this study
Strain | Relevant genotype | Parent strain | Reference
Y1 | MATa trp1-1 can1-100 leu2-3,112 his3-11,15 ura3 GAL+ ADE+ | W303 | Lab collection
Y6 | MATα ade2-1 trp1-1 can1-100 leu2-3,112 his3-11,15 ura3 GAL+ | W303 | Lab collection
Y139 | MATa his3::P1xtetO-vYFP-HIS3 | Y1 + B163 | Lab collection
Y163 | MATa his3::P7xtetO-vYFP-HIS3 | Y1 + B165 | Lab collection
Y236 | MATa/α leu2::PPGK-RFP ADE2::PMYO2-tTA ura3/ura3::PtetO-CFP-kanR/PtetO-vYFP-kanR | Y231 x Y216 | This study
Y532 | MATa doa1::vYFP + pRS316-DOA1 | Y1 + B858 | This study
Y955 | MATα his3::P1xtetO-tdTomato-HIS3 | Y2 + B229 | This study
Y1011 | MATa/α his3/his3::P1xtetO-vYFP-HIS3/P1xtetO-tdTomato-HIS3 | Y139 x Y955 | This study
EY2436 | pho5Δ::CFP-KANMX6 | W303 | Gift from E. O'Shea
Table 6-2: Plasmids used in strain construction
Plasmid | Base vector | Relevant gene | Construction information
B163 | pRS303 | P1xtetO-vYFP | Lab collection
B165 | pRS303 | P7xtetO-vYFP | Lab collection
B228 | pCM189 | PADH1-tTA | Lab collection
B229 | pRS303 | P1xtetO-tdTomato | Lab collection
B858 | pRS316 | PDOA1-DOA1 | PCRed DOA1 locus from Y1 (BamHI/NotI) and ligated into pRS316
6.2 Protocols
6.2.1 Growth & arrest protocols
Yeast cultures were grown at 30 °C in test tubes in synthetic minimal medium
supplemented with 2% glucose and amino acids. A fresh culture, grown for 4-6 hours off a
petri dish, was diluted to A600nm = 0.0003-0.002 and induced with doxycycline (Sigma-Aldrich)
such that overnight culture of at least 14 hours reached A600nm = 0.1-0.5. This was steady
state expression (7 or more doublings). Densities higher than this appeared to affect
expression. 3 mL was sufficient for each mRNA FISH sample.
Cell-cycle arrest at the G1/S transition was achieved with mating pheromone alpha-factor.
This was added to a concentration of 3 µM from 100X stock at 0 hours then
supplemented with half that amount (i.e. an extra 1.5 µM) after 1 and 3 hours. Cells were
fixed after 3 or 5 hours of alpha-factor treatment. Most cells (>95%) showed clear signs of
alpha-factor arrest during microscopy (with the shmoo projection), and only these were
analyzed. Arrest at G2/M was achieved with microtubule-inhibiting nocodazole. This was
added at 0.015 mg/mL from a 100X stock of 1.5 mg/mL in DMSO at 0 hours, then
supplemented with half that amount every 2 hours. Cells were fixed after 2, 5 or 8 hours of
nocodazole treatment. Again, most cells (>95%) were clearly arrested, with a large dumbbell
shape, and only these were analyzed. Expression was always measured on growing, G1- and
G2-arrested populations from an identical overnight culture, which was split for fixation
(growing) and arrest at t=0.
6.2.2 mRNA FISH
Fluorescence in situ hybridization (FISH) to count mRNA in single cells (Raj et al.,
2008): 20-50 different single-stranded DNA probes to vYFP were coupled to
tetramethylrhodamine (TMR) or indodicarbocyanine (Cy5) fluorophores, and probes to
tdTomato were coupled to TMR fluorophores, as reported in To & Maheshri (2010). Yeast
were grown to early log-phase (OD600nm = 0.1-0.5) then fixed, spheroplasted, hybridized and
washed similarly to Raj & van Oudenaarden (2008) with modifications as described in To
& Maheshri (2010). DNA probes at ~5 µM were diluted 50-fold into hybridization solution
containing 10% formamide. The set of probes produces sufficient fluorescence to detect a
single mRNA with wide-field fluorescence microscopy. Cells were imaged on a Zeiss
AxioObserver inverted microscope equipped with a PRIOR Lumen200 mercury arc lamp, a
100X/1.40 objective (Zeiss) and rhodamine- and Cy5-specific filter sets (Chroma
Technology Cat. No. 31000v2 and 41024 respectively). For each sample, eight Z-stack images
0.3 microns apart were obtained.
6.2.3 mRNA FISH image analysis
Z-stack images were analyzed using custom software written in MATLAB based on that
used in (To & Maheshri, 2010). The algorithm used to identify spots corresponding to single
mRNA applies region-based thresholding and identifies local maxima as spots. Three
parameters used by the algorithm can change due to day-to-day variation in staining: (1)
the minimum intensity for a pixel to be considered as part of a spot, manually set by
examining several z-stacks, picking a threshold that identifies spots and not background,
and verified by ensuring a false positive rate of < 4% in negative control samples; (2) the
average intensity of a single mRNA spot, chosen using the mode of spot intensities for lower
expressing samples (Figure 6-1A), to allow counting of multiple overlapping mRNA; and (3)
the threshold intensity at which a spot is classified as a site of nascent transcription, chosen
as the transition between the peaked and flat sections of a histogram of spot intensities
(~5-10 fold higher than the intensity of a single spot; Figure 6-1A). Mean protein levels in
different samples were used as an internal control, and we always verified the ratio of mean
protein level to mean mRNA count was consistent across samples and expression levels.
Figure 6-1B is an example of a processed image showing mRNA spots.
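The three parameters described above can be illustrated with a short Python sketch. It is not the MATLAB software used in this work: the function names, the histogram bin width and the example intensities are hypothetical, and the nuclear-alignment check for nascent sites is omitted.

```python
from collections import Counter

def estimate_single_intensity(intensities, bin_width):
    """Mode of a binned intensity histogram, returned as the bin centre.

    bin_width is a hypothetical choice; ties go to the lowest-intensity bin.
    """
    bins = Counter(int(x // bin_width) for x in intensities)
    top_bin = max(bins, key=lambda b: (bins[b], -b))
    return (top_bin + 0.5) * bin_width

def count_mrna(intensities, min_intensity, single_intensity, nascent_threshold):
    """Return (mature mRNA count, number of nascent transcription sites)."""
    mature = 0
    nascent_sites = 0
    for x in intensities:
        if x < min_intensity:          # parameter 1: below the spot threshold
            continue
        if x >= nascent_threshold:     # parameter 3: bright enough to be nascent
            nascent_sites += 1
        else:                          # parameter 2: normalize by one mRNA
            mature += max(1, int(x / single_intensity + 0.5))
    return mature, nascent_sites

# Made-up spot intensities for illustration only.
spots = [9000, 10000, 11000, 21000, 32000, 90000, 3000]
single = estimate_single_intensity(spots, bin_width=5000)
print(single, count_mrna(spots, 5000, single, 5 * single))
```

Normalizing by the single-mRNA intensity is what lets overlapping transcripts in the z-projection be counted as more than one molecule; the 5-fold nascent threshold in the usage line is taken from the low end of the range quoted above.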
[Figure 6-1 panels: (A) histogram of spot intensity (x-axis: spot intensity, ×10^5); (B) example cell images.]
Figure 6-1: (A) Histogram of the mean pixel intensity of spots detected as mRNA in a population
of cells with green, blue and red lines showing the parameters selected to analyze the spots. The
threshold (green line) is the minimum pixel intensity for a pixel to be considered as a spot, selected
to keep false positives < 4% in a negative control sample without the fluorescent reporter and be
consistent with manual visual inspection of a subset of images. Multiple mRNAs that overlap in the
z-projection appear as a brighter spot. The mode of the histogram (blue line) is selected as the
intensity of a single mRNA and spots that are >= 2-fold brighter are counted as multiple spots by
normalizing with the mode. Very bright spots (>4-fold brighter than a single mRNA spot,
in the flat region of the histogram; red line) are thought to be formed by sites of nascent transcription
if they align with the nucleus and are automatically identified as those with intensities in the flat
region of the histogram. (B) Images of cells with mRNA counts measured by mRNA FISH.
(top) Bright-field overlaid with blue DAPI-stained nucleus and (middle) the maximum projection of
eight fluorescent rhodamine-stained images within a Z-stack. (bottom) mRNA and nascent
transcription sites identified by the spot-counting algorithm are marked with a red or magenta dot
respectively. Nascent spots align with the blue DAPI-stained nucleus.
6.2.4 Numerical solutions to stochastic models
The two-state transcriptional bursting model has an analytical solution at steady state;
but more complex schemes, including cell-cycle dependent transcription, do not. Numerical
methods enable simulation of more complicated transcription patterns to predict expression
distributions. The Gillespie algorithm (Gillespie, 1977) is a kinetic Monte Carlo simulation
of stochastic chemical equations. It is simple to implement for any reaction scheme and
statistically correct but requires large sampling and thus can be computationally expensive.
It allows for simultaneous tracking of multiple state variables. The Finite Markov Chain
method is also amenable to almost any reaction scheme and gives a solution for the
distribution of state variables over transient or steady-state dynamics. For steady-state
dynamics, the principal eigenvector of the transition matrix is the steady-state distribution.
It is an exact solution provided that the simulation models a sufficiently large state space
(Munsky & Khammash, 2008). But it becomes memory limited and intractable for large
state spaces, making the Gillespie method more useful for simulating multiple species/state
variables.
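As a minimal illustration of the finite Markov chain approach, the sketch below finds the steady-state mRNA distribution of the two-state bursting model by power iteration on a truncated state space. It is written in Python rather than the MATLAB used in this work, and the rate constants, the truncation at 60 mRNA and the function name are assumptions chosen for illustration.

```python
def two_state_steady_state(k_on, k_off, k_tx, k_deg, max_mrna=60,
                           tol=1e-12, max_iter=100000):
    """Steady-state mRNA distribution of a two-state (OFF/ON) promoter model.

    States are (promoter state, mRNA count), truncated at max_mrna. Repeatedly
    applying the one-step transition matrix converges to its principal
    eigenvector, which is the steady-state distribution.
    """
    n = max_mrna + 1
    fastest_exit = max(k_on, k_off + k_tx) + k_deg * max_mrna
    dt = 0.9 / fastest_exit            # keeps every one-step probability in [0, 1]
    p = [[0.0] * n, [0.0] * n]         # p[0] = OFF states, p[1] = ON states
    p[0][0] = 1.0
    for _ in range(max_iter):
        new = [[0.0] * n, [0.0] * n]
        for s in (0, 1):
            switch = k_on if s == 0 else k_off
            for m in range(n):
                mass = p[s][m]
                if mass == 0.0:
                    continue
                # Transcription only from ON; blocked at the truncation boundary.
                make = k_tx if (s == 1 and m < max_mrna) else 0.0
                decay = k_deg * m
                new[1 - s][m] += mass * switch * dt
                if make:
                    new[s][m + 1] += mass * make * dt
                if m:
                    new[s][m - 1] += mass * decay * dt
                new[s][m] += mass * (1.0 - (switch + make + decay) * dt)
        change = max(abs(new[s][m] - p[s][m]) for s in (0, 1) for m in range(n))
        p = new
        if change < tol:
            break
    return [p[0][m] + p[1][m] for m in range(n)]

# Hypothetical rates, in units of the mRNA degradation rate.
dist = two_state_steady_state(k_on=0.5, k_off=1.0, k_tx=10.0, k_deg=1.0)
print(sum(m * q for m, q in enumerate(dist)))  # mean mRNA per cell
```

A useful check on any such solver is that the mean of the returned distribution matches the analytical mean of the two-state model, k_on/(k_on + k_off) × k_tx/k_deg. The memory cost grows with the number of states, which is the limitation noted above.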
6.3 Quantification of mRNA dynamics
6.3.1 Nuclear and cytoplasmic mRNA degradation half-life
mRNA dynamics are necessary to infer transcription dynamics from mRNA expression.
The kinetics of mRNA degradation were measured by thiolutin treatment, which is a potent
inhibitor of bacterial and yeast RNA polymerases, and thus ceases all transcription. mRNA
expression following thiolutin treatment represents mRNA decay. We simplify mRNA
degradation as a first-order process (Equation 6-1) with half-life of 20 minutes. This agrees
well for 80% of mRNA degradation over 60 minutes (Figure 6-2A,B black). But degradation
is long-tailed, meaning that some portion of mRNA degrade much more slowly. This is
evident in mRNA levels higher than predicted by first-order decay at long time points. This
trend is better represented by two first-order decay processes (Equation 6-2) with half-lives
of 10 minutes and 30-40 minutes (Figure 6-2A, B grey). This uncertainty in mRNA
degradation causes uncertainty in transcriptional dynamics, particularly in G1 where mRNA
decay is a dominant process.
\[ \frac{dM}{dt} = -\frac{1}{t_{Deg}} M, \qquad M = M_0 e^{-t/t_{Deg}} \tag{6-1} \]
\[ \frac{dM}{dt} = -\frac{1}{t_{Deg1}} M_1 - \frac{1}{t_{Deg2}} M_2, \qquad M = M_1 + M_2 = M_{1,0} e^{-t/t_{Deg1}} + M_{2,0} e^{-t/t_{Deg2}} \tag{6-2} \]
[Figure 6-2 plots: mRNA level (0-100%) versus time after thiolutin treatment (0-90 minutes) for (A) cytoplasmic mRNA (single-exponential fit tDeg = 22; two-exponential fit with tDeg2 = 40) and (B) nuclear mRNA (single-exponential fit tDeg = 20; two-exponential fit with tDeg2 = 30).]
Figure 6-2: Measurement of mRNA degradation dynamics. Cells were grown to steady-state
expression and then treated with thiolutin, which ceases transcription. (A) Cytoplasmic mRNA
degrades by first-order exponential decay with a half-life of 20 minutes over 60 minutes. But
degradation beyond 60 minutes is long-tailed, better reflected by two exponential decay processes with
half-lives of 10 minutes and 40 minutes. (B) Nuclear mRNA decays with similar kinetics to
cytoplasmic. Rates are slightly faster, perhaps because nuclear mRNA is not detected below 2-3
mRNA.
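The difference between the one- and two-process descriptions is easiest to see numerically. The Python sketch below treats the fitted tDeg values as exponential time constants, as in Equation 6-1, and assumes the fast and slow pools each start with half of the mRNA; the equal split is hypothetical, since the fitted pool sizes are not given here.

```python
import math

def single_decay(t, t_deg):
    """Fraction of mRNA remaining after t minutes for one first-order process."""
    return math.exp(-t / t_deg)

def double_decay(t, t_deg1, t_deg2, fast_fraction):
    """Fraction remaining when a fast and a slow pool decay independently."""
    return (fast_fraction * math.exp(-t / t_deg1)
            + (1.0 - fast_fraction) * math.exp(-t / t_deg2))

# fast_fraction=0.5 is an assumed pool split, not a fitted value.
for t in (0, 30, 60, 90):
    print(t, round(single_decay(t, 20.0), 3),
          round(double_decay(t, 10.0, 40.0, 0.5), 3))
```

The two curves stay close at early times and separate at late times, where the slow pool keeps mRNA levels above the single-exponential prediction. That is the long tail described in the text.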
6.3.2 Rate of nuclear mRNA export
To calculate the rate of nuclear mRNA export, we consider nuclear and cytoplasmic
mRNA expression levels in cells active under arrest. Assuming these cells are at steady-state
(supported by our measurements of arrested expression over time), the export rate is related
to the known cytoplasmic mRNA degradation rate by the ratio of nuclear and cytoplasmic
mRNA:
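\[ \frac{dM}{dt} = k_{Export} N - k_{Degradation} M = 0 \ \text{at steady state} \quad\Rightarrow\quad \frac{M}{N} = \frac{k_{Export}}{k_{Degradation}} \tag{6-3} \]
where M is mean cytoplasmic mRNA and N is mean nuclear mRNA for a given sample.
Across all samples, this ratio is 2 (Figure 6-3). Thus we infer a first-order export rate twice
the cytoplasmic degradation rate, i.e. an export timescale of about 10 minutes.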
[Figure 6-3 plot: cytoplasmic mRNA versus median nuclear mRNA, with the line M/N = 2.]
Figure 6-3: In arrested, steady-state expression, ON cells' cytoplasmic mRNA levels are 2-fold
ON cells' nuclear mRNA levels across all samples, indicating nuclear mRNA export has twice the
rate of cytoplasmic mRNA degradation.
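The steady-state balance behind this inference can be checked in a few lines of Python. The sketch assumes the 20-minute degradation time is an exponential time constant, so that kDegradation = 1/20 per minute; the function name is hypothetical.

```python
def export_rate(cyto_to_nuclear_ratio, k_degradation):
    """k_Export from the steady-state balance k_Export * N = k_Degradation * M."""
    return cyto_to_nuclear_ratio * k_degradation

k_deg = 1.0 / 20.0                    # per minute; assumed time-constant convention
k_exp = export_rate(2.0, k_deg)       # M/N = 2 across samples
print(k_exp, 1.0 / k_exp)             # rate per minute, and the export timescale in minutes
```

Because only the ratio M/N enters, any uncertainty in the degradation rate carries over proportionally to the inferred export rate.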
6.4 mRNA FISH error analysis
Error analysis for mRNA FISH results is not shown on the figures in Chapter 4 for
clarity. (Error is particularly difficult to report when the experimental results are
distributions themselves.) Here, we provide the number of replicates for each
data point (N=1-5, Table 6-3) and the number of cells sampled for each replicate
(N = 186 ± 95 cells, mean ± standard deviation, Table 6-4). This gave experimental
error in mRNA count (Table 6-5, 3 mRNA on average) and in the fraction of active cells
(Table 6-6, 6% on average). Most (70-95%) of the error derived from variation between replicates,
rather than from sampling individual cells (as determined by bootstrapping). The level
of error is illustrated in Figure 6-4 & Figure 6-5, which correspond to figures from Chapter
4. They show that error between the means of replicates is low and unlikely to interfere with
results. Error from sampling within each replicate, shown as error bars on each point, is
even lower.
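The two error sources can be separated as in the following Python sketch. It is an illustration under stated assumptions rather than the analysis code used here: the number of bootstrap resamples, the random seed, the function names and the example counts are all hypothetical.

```python
import random
import statistics

def bootstrap_sd_of_mean(counts, n_resamples=2000, seed=0):
    """Sampling error of a replicate's mean mRNA count.

    Cells are resampled with replacement; the standard deviation of the
    resampled means estimates the error from sampling individual cells.
    """
    rng = random.Random(seed)
    n = len(counts)
    means = [statistics.fmean(rng.choices(counts, k=n))
             for _ in range(n_resamples)]
    return statistics.stdev(means)

def between_replicate_sd(replicates):
    """Standard deviation of replicate means (needs at least two replicates)."""
    return statistics.stdev([statistics.fmean(r) for r in replicates])

# Made-up per-cell mRNA counts for one replicate of 200 cells.
example_counts = [i % 10 for i in range(200)]
print(bootstrap_sd_of_mean(example_counts))
```

With about 200 cells per replicate the bootstrap estimate should sit close to the usual standard error of the mean, which gives a quick sanity check before comparing it with the spread between replicates.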
Table 6-3: Number of replicates for each data point.
Activation | 1xtetO Growing | 1xtetO G1 arrest | 1xtetO G2 arrest | 7xtetO Growing | 7xtetO G1 arrest | 7xtetO G2 arrest
0% | 2 | 1 | 1 | 1 | 1 | 1
1% | 4 | 1 | 1 | 2 | 1 | 1
5% | 2 | 1 | 1 | 3 | 3 | 4
13% | 2 | 2 | 2 | 3 | 3 | 4
22% | 1 | 1 | 1 | 2 | 2 | 2
33% | 1 | 1 | 1 | 2 | 2 | 2
50% | 3 | 2 | 2 | 2 | 2 | 2
77% | 1 | 1 | 1 | 2 | 2 | 2
95% | 5 | 3 | 3 | 5 | 3 | 4
Table 6-4: Average number of cells sampled in each replicate.
Activation | 1xtetO Growing | 1xtetO G1 arrest | 1xtetO G2 arrest | 7xtetO Growing | 7xtetO G1 arrest | 7xtetO G2 arrest
0% | 230 | 110 | 130 | 170 | 260 | 250
1% | 260 | 190 | 190 | 130 | 130 | 150
5% | 180 | 150 | 140 | 260 | 270 | 80
13% | 210 | 250 | 120 | 260 | 150 | 90
22% | 470 | 180 | 100 | 280 | 130 | 110
33% | 560 | 130 | 70 | 270 | 200 | 110
50% | 320 | 170 | 80 | 350 | 160 | 110
77% | 140 | 200 | 70 | 230 | 180 | 100
95% | 160 | 270 | 100 | 140 | 230 | 90
Table 6-5: Total experimental error in mean mRNA count, from error between replicates and
from sampling within a replicate. Error is reported for data points with > 1 replicate.
1xtetO
7xtetO
0%
1%
5%
13%
22%
33%
50%
77%
95%
Growing
1.3
0.7
1.8
1.7
G1 arrest
G2 arrest
0.2
5.4
1.8
0.7
3.3
3.1
1.9
3.4
Growing
G1 arrest
G2 arrest
0.7
1.4
1.6
0.6
1.8
3.3
5.0
Average:
4.0
6.7
1.8
12.2
5.1
3.0
8.7
3 mRNA
0.2
5.1
3.0
1.1
3.9
2.3
3.4
3.5
Table 6-6: Total experimental error in fraction of active cells, from error between replicates and
from sampling within a replicate.
0%
1%
5%
13%
22%
33%
50%
77%
95%
A
0
2
1xtetO
GI arrest
Growing
4%7
5%
10%
7%
1%
2
8%
1%
23%
4%
4%
3%
2o
2%
5%
8%
14%
4.
DOA1
c
4%
6%
3b
B 0
B0
-'. u.-
7xtetO
GI arrest
Growing
.
2
94
G2 arrest
7xtetO
0%
5%
4
8%
3%
7%
8%
Average:
C 0
95%.
D2
G2 arrest
13 %
11%
3%
4%
2o
5%
2%
6%F
[Figure 6-4 plots, panels A-E. Panels A-C x-axis: Cytoplasmic mRNA per gene copy. Panels D-E x-axis: Fraction of active cells. Legend entries: G1, G2 Mid, G2 Late, G1 arrest, G2 arrest.]
Figure 6-4: Experimental error in data points from Figure 4-2. (A, B, C) Error in mean mRNA
counts: each replicate is shown as a point and error bars represent standard deviation of sampling
error determined by bootstrapping. (D, E) Error in fraction of active cells: each replicate is shown
in overlapping bars, with error bars representing standard deviation from sampling error. Error is
low enough that it does not interfere with interpretation of results.
[Figure 6-5 plots, panels A-H, for 1xtetO and 7xtetO under G1 arrest and G2 arrest. Axes shown: Cytoplasmic mRNA per gene copy; Fraction of active cells; Cytoplasmic mRNA per gene copy in ON cells; Nuclear mRNA per gene copy in ON cells.]
Figure 6-5: Experimental error in data points from Figure 4-3. (A, B, E-H) Error in mean mRNA
counts: each replicate is shown as a point and error bars represent standard deviation of sampling
error determined by bootstrapping. (C, D) Error in fraction of active cells: each replicate is shown
in overlapping bars, with error bars representing standard deviation from sampling error. Error is
reliably low and does not interfere with interpretation of results.
6.5 References
Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The
Journal of Physical Chemistry, 81(25), 2340-2361.
Guthrie, C., & Fink, G. R. (2002). Guide to yeast genetics and molecular and cell biology:
Part C. Gulf Professional Publishing.
Munsky, B., & Khammash, M. (2008). The finite state projection approach for the
analysis of stochastic noise in gene networks. Automatic Control, IEEE Transactions
on, 53(Special Issue), 201-214.
Raj, A., & van Oudenaarden, A. (2008). Nature, nurture, or chance: Stochastic gene
expression and its consequences. Cell, 135(2), 216-226.
Thomas, B. J., & Rothstein, R. (1989). Elevated recombination rates in transcriptionally
active DNA. Cell, 56(4), 619-630.
To, T., & Maheshri, N. (2010). Noise can induce bimodality in positive transcriptional
feedback loops without bistability. Science, 327(5969), 1142-1145.