Analysis of Metazoan DNA Replication Initiation using
Drosophila Gene Amplification as a Model System
by
Jane Christina Kim
B.S. Molecular, Cellular, and Developmental Biology, 2004
Yale University
New Haven, CT
Submitted to the Department of Biology
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy in Biology
at the
Massachusetts Institute of Technology
Cambridge, MA
February 2011
© 2010 Jane Christina Kim. All rights reserved.
The author hereby grants to MIT permission to reproduce or distribute
publicly paper and electronic copies of this thesis document in whole or in part.
Signature of Author
...........................................
Department of Biology
November 8, 2010
Certified by
...........................
Terry L. Orr-Weaver
Professor of Biology
Thesis Supervisor
Accepted by
...........................
Stephen P. Bell
Chair, Committee on Graduate Students
Department of Biology
Analysis of Metazoan DNA Replication Initiation using
Drosophila Gene Amplification as a Model System
by
Jane Christina Kim
Submitted to the Department of Biology on November 8, 2010
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy in Biology
ABSTRACT
Gene amplification in Drosophila follicle cells is an excellent model to study origin
specification and developmental regulation of DNA replication in vivo. We mapped all follicle
cell amplicons using a comparative genomic hybridization strategy and identified two new
amplicons. We determined the precise localization of the origin recognition complex (ORC) on a
genome-wide level and observed that, at the start of synchronous amplification, ORC localizes to
the six amplicons with levels corresponding to the magnitude of amplification. Additionally, we
investigated amplification with respect to transcription and chromatin state. The levels and
timing of gene expression in some amplicons suggest that gene amplification is not exclusively a
developmental strategy to promote high expression levels. Follicle cell amplicons are enriched
for acetylated H4, but this mark is not sufficient for ORC localization or amplification. In
addition to genome-wide analyses, we characterized the two new amplicons and discovered
unique properties that make both distinctive replication models. Strikingly, DAFC-22B shows
strain-specificity in amplification, a property that is correlated with the ability to localize ORC.
We identified sequence differences between closely related amplifying and non-amplifying
strains and used P element mediated transformation to test sufficiency for ORC binding and
amplification at this region. DAFC-34B contains two genes that are expressed in follicle cells.
Vm34Ca is a structural component of the vitelline membrane but is expressed prior to the onset
of gene amplification. CG16956 is expressed in amplification stages but only in a small subset of
follicle cells. Like the previously characterized DAFC-62D, DAFC-34B displays origin firing at
two separate stages of development. However, unlike DAFC-62D, amplification at the later stage
is not transcription dependent. We mapped the DAFC-34B amplification origin to 1kb by nascent
strand analysis and delineated the cis requirements for origin activity, finding that a 6 kb region,
but not the 1 kb origin alone, is sufficient for amplification. We analyzed the developmental
localization of ORC and the MCM complex, the replicative helicase. Intriguingly, the final round
of origin activation at DAFC-34B occurs in the absence of detectable ORC, though MCMs are
present, suggesting a novel initiation mechanism. Our analysis of follicle cell amplicons
highlights the diversity of amplification origin control mechanisms within the same cell type,
which may be representative of similar regulatory diversity during S phase DNA replication.
Thesis Supervisor: Terry L. Orr-Weaver
Title: Professor of Biology
Dedicated, with love and thankfulness,
to my parents
Sung and Kathy Kim
Acknowledgements
I rotated in and joined Terry's lab in the spring of 2005 when she was away on sabbatical. It was
a bold decision but one that I would make again in a heartbeat because of the kind of advisor
Terry is. Terry is a brilliant scientist who has made many seminal contributions to biology.
Additionally, she has poured so much of herself into the mentorship of her trainees that, even in
her absence my first year (though with frequent phone meetings and regular visits), I experienced
top-notch scientific guidance and overwhelmingly supportive collegiality through being a
member of the Orr-Weaver lab. Since then, I have only had more reasons to appreciate Terry: her
genuine passion for science and the fairness and thoughtfulness with which she manages her lab.
I know I have grown in tremendous ways scientifically, professionally, and personally because
of Terry's mentorship, and for this I am deeply thankful.
I want to thank all of the members of the Orr-Weaver lab whom I had the privilege of working
with. They have been and continue to be wonderfully supportive and fun colleagues. Their
names are too numerous to list here, but each has made lasting contributions to my scientific and
personal wellbeing. I will miss being in the TOW-ZONE very much.
Steve Bell and Peter Reddien have been on my thesis committee for the past five years. They
have provided valuable scientific input and practical guidance that has helped move my project
forward every year. Since the beginning, I have felt that they were rooting for me to do well, and
this implicit encouragement has been immensely motivating. Graham Walker is newly on my
thesis committee this year, but I have admired the work of his education group since the
beginning of grad school. His commitment to training scientist-educators and developing
effective biology curriculum has inspired me to believe that this endeavor is a worthy one. Jeff
Kapler was on sabbatical in Terry's lab during the 2008-2009 academic year from Texas A&M.
He provided a lot of good advice about molecular biology experiments and how to be a well-rounded scientist. He also gave me the single best piece of advice I have ever received, and I
reflect on it often.
Despite having a wonderful advisor and supportive lab mates, grad school has been challenging,
though in the words of Randy Pausch, "Brick walls are there to show how badly we want
something." My cherished friends and family have continuously kept my perspective in check
and my mood cheerful. I want to thank Wendy Lam, Michelle Sander, Sudeep Agarwala, and
Renuka Pandya for their friendship, which I treasure greatly. The friends I have made through
MIT and MIT Biograds, Sidney Pacific, and Highrock are, blessedly, too numerous to list here,
but I am grateful for them. My high school A4L, Yale girlfriends, and extended relatives have
treated me like a (scientific) rock star, and their love and enthusiastic support has stood the test
of time (and my scientific hermit phases). This most definitely includes thanks to my brother
John for being everything a girl could ask for in an older brother.
Finally, I would like to thank my parents. They gave life to me, but through their unwavering
support and love they gave me my life. I know the immigrant story of leaving everything familiar
behind to pursue opportunity for one's children is not uniquely my family's, but I am profoundly
thankful to be the outcome of this one story. This thesis is a testament to my parents' sacrifice,
support, and love. Thank you, Mom and Dad.
TABLE OF CONTENTS
Chapter One:
Introduction
Eukaryotic replication initiation proteins and cell cycle regulation
Replication origin discovery
Models from budding yeast and fission yeast
Identification of individual metazoan replication origins
Methods of replication origin discovery on a genome-wide scale
Developmental gene amplification as an origin discovery tool
Properties of metazoan replication origins
Sequence properties and genome distribution of replication origins
Replication and transcription
Replication and chromatin context
Replication timing and cell type specific replication programs
The importance of cataloging replication origins
Summary of thesis
References
Chapter Two:
Genome-wide identification of Drosophila follicle cell amplicons as
in vivo model replicons
Abstract
Introduction
Results
Identification of two new follicle cell amplicons by aCGH
Genome-wide expression analysis of follicle cells
ORC binding in amplicons localizes to the most amplified region
H4 acetylation corresponds to the magnitude of amplification
DAFC-22B exhibits strain-specific amplification
Mapping cis elements responsible for DAFC-22B amplification
Discussion
Materials and Methods
Acknowledgements
References
Chapter Three:
Differential ORC localization during two rounds of replication initiation
at a Drosophila follicle cell amplicon
Abstract
Introduction
Results
Two genes in DAFC-34B are expressed in follicle cells
DAFC-34B shows two distinct stages of replication initiation
DAFC-34B amplification origin corresponds to the Vm34Ca
transcription unit
Developmental control of ORC and MCM localization at DAFC-34B
orc mutant blocks both rounds of replication initiation at DAFC-34B
Delineation of cis control elements for replication at DAFC-34B
Discussion
Materials and Methods
Acknowledgements
References
Chapter Four:
Conclusions and Perspectives
Active transcription as a causal determinant of gene amplification
CG7337 expression and strain-specific amplification at DAFC-22B
Specific histone acetylation and gene amplification
Investigating ORC independent initiation at DAFC-34B
Application of Drosophila genomic resources to the study
of follicle cell gene amplification
References
Appendix One:
Strategy and preliminary results toward generation of conditional replication
factor mutants in Drosophila
Introduction
Results
Discussion
References
Appendix Two:
Synteny analysis of the DAFC-30B amplified region
Results
Acknowledgements
References
Appendix Three:
Summary of follicle cell amplicons
Chapter One:
Introduction
Complete duplication of the genome during S phase is critical for the accurate
transmission of genetic material to daughter cells during the cell cycle. Two fundamental
questions related to the regulation of this genome duplication process are: (1) how are individual
DNA regions selected to function as replication origins, or sites from which replication initiation
occurs; and (2) how is the activation of replication origins coordinated genome-wide such that
every sequence is replicated once, and only once, per cell cycle? The identification of replication
origins to uncover functional properties of these genomic regions and analyze their regulatory
mechanisms is key to answering these questions. However, this task has been challenging in
metazoan systems because, until recently, only a small number of replication origins had been
identified for molecular characterization.
This chapter focuses on the identification of metazoan replication origins and the insights
into origin function that their analyses have provided: both insights from studies of individual
replication origins as well as recent genome-wide studies to map replication origins using
microarray and next-generation sequencing methods. The emerging picture of metazoan
replication origins is one where, despite the absence of a sequence-specific motif for origin
specification, cis-acting information does influence the binding of replication proteins as well as
origin function. Furthermore, whether or not DNA replication initiates from a given genomic
locus is a highly dynamic process subject to influence from transcriptional activity and
chromatin state.
Eukaryotic replication initiation proteins and cell cycle regulation
In 1963, Jacob et al. proposed the "replicon model" to explain the regulation of DNA
synthesis in bacteria (JACOB and BRENNER 1963). According to this model, an initiator protein
that was encoded by a structural gene would interact with a sequence-encoded genetic element
called the replicator to initiate DNA replication. Identification of the origin recognition complex,
or ORC, nearly 30 years later demonstrated that a eukaryotic initiator existed (BELL and
STILLMAN 1992). However, the identification of origins in eukaryotes reveals that there is
greater diversity in origin structure than in the protein complexes that promote origin activation.
ORC marks all potential sites of origin activation in eukaryotes (BELL and DUTTA 2002).
This hexameric complex binds DNA and recruits Cdc6 and Cdt1. The subsequent ATP
hydrolysis activities of ORC and Cdc6 result in the stable loading of the MCM2-7 replicative
helicase to DNA and poise the site for unwinding of the double-stranded helix. Together, these
proteins comprise the pre-replicative complex (pre-RC), and formation of these complexes
establishes origin licensing, or competence for DNA synthesis. Though identified primarily in
yeast, the components of the pre-RC have been identified in all eukaryotes examined. The pre-RC is activated to form the pre-initiation complex (pre-IC), whose components in yeast include
Cdc45, GINS, Sld2, Sld3, Dpb11, and Mcm10. The mechanism of this activation process is an
area of active investigation, but a key event is the phosphorylation of MCM subunits by the
Dbf4-dependent kinase, or DDK (WALTER and ARAKI 2006). Additionally, CDK-dependent
phosphorylation of Sld2 and Sld3 leads to activation of the pre-IC and recruitment of DNA
polymerases to initiate DNA synthesis (TANAKA et al. 2007; ZEGERMAN and DIFFLEY 2007).
Activation of the pre-RC (and the initiation of DNA synthesis) is often referred to as origin firing,
a term that will be used interchangeably with origin activation in this text (see Table 1-1 for replication
terminology).
To ensure that origin activation occurs once and only once per cell cycle, the process of
pre-RC formation is tightly regulated. Across diverse eukaryotic organisms, a general strategy is
to separate the origin licensing and origin activation stages in the cell cycle. Origin licensing
occurs during G1 phase when CDK activity is low, whereas origin activation occurs when CDK
activity is high. Additionally, cells use redundant mechanisms to prevent pre-RC assembly
outside of G1, including targeted protein synthesis of pre-RC components during G1 and
inactivation of these proteins following replication initiation via protein degradation and nuclear
export (ARIAS and WALTER 2007). Although the specific combination of strategies differs
depending on the organism, one mechanism common to metazoans is the activity of the protein
Geminin (BELL and DUTTA 2002; MELIXETIAN and HELIN 2004). Geminin binds to Cdt1 and
inhibits its function, thus adding another layer of regulation to prevent origin licensing outside of
G1 phase. Overexpression of Cdt1 has been shown to cause re-replication in multiple cell
systems (THOMER et al. 2004; YANOW et al. 2001; ZHONG et al. 2003), thus highlighting the
importance of limiting Cdt1 activity to prevent re-replication and maintain genomic stability.
Table 1-1. Replication terminology.
Term | Definition | Reference
Initiator | Protein or protein complex that binds replicators and is required for replication initiation (term first proposed in replicon model) | (JACOB and BRENNER 1963)
Initiation site (replication origin) | Location on DNA from which replication forks emanate |
Origin activation (origin firing) | Activation of the pre-RC and the initiation of DNA synthesis |
Origin efficiency | Percentage of cells that activate a particular replication origin every cell cycle |
Replicator | Genetic element that is required for replication initiation from a particular chromosomal location (term first proposed in replicon model) | (JACOB and BRENNER 1963)
Replicon | Region of DNA that is duplicated by a replication origin |
Spatial replication program | The physical distribution of all replication origins used to replicate the genome |
Temporal replication program | The timing with which all replication origins are activated to replicate the genome |
Replication origin discovery
Models from budding yeast and fission yeast
To identify cis-acting origins of replication in yeast, a strategy from bacteria was adapted
to isolate sequences that would support autonomous replication in a host cell. The autonomously
replicating sequence (ARS) assay was successfully implemented in budding yeast and fission
yeast to identify sequences that conferred origin function both in the context of a plasmid as well
as the native chromosomal locus (ALADJEM et al. 2006; STINCHCOMB et al. 1980). In budding
yeast, these sequences are approximately 125 base pairs in length. Compilation of these
sequences led to the identification of an 11 bp ARS consensus sequence (ACS) that is required,
though not sufficient, for ORC binding. Molecular dissection of a number of budding yeast
origins reveals a modular structure, with different conserved elements that stabilize ORC binding
and MCM recruitment (LIPFORD and BELL 2001; RAO and STILLMAN 1995) (Figure 1-1A).
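As an illustration of how a short consensus such as the ACS can be used to flag candidate sites, the following is a minimal Python sketch. It assumes the commonly cited form of the budding yeast ACS, WTTTAYRTTTW (W = A or T, Y = C or T, R = A or G), which is not stated in this text; the function names are hypothetical. Because the ACS is required but not sufficient for ORC binding, a match marks only a candidate site, not a functional origin.

```python
import re

# Assumption: the commonly cited 11 bp ACS consensus, written in IUPAC codes.
ACS_CONSENSUS = "WTTTAYRTTTW"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_acs_matches(seq):
    """Return sorted (start, strand) pairs for exact consensus matches.

    start is the 0-based left-most coordinate of the match on the input
    sequence, whichever strand the match is on.
    """
    body = "".join(IUPAC[c] for c in ACS_CONSENSUS)
    # A lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + body + "))")
    seq = seq.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = revcomp(seq)
    for m in pattern.finditer(rc):
        # Convert a reverse-strand coordinate back to the input sequence.
        hits.append((len(seq) - m.start() - len(ACS_CONSENSUS), "-"))
    return sorted(hits)
```

For example, find_acs_matches("GGGATTTATATTTAGGG") reports a single plus-strand candidate starting at position 3.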
Figure 1-1. Examples of replication origins from diverse eukaryotes.
(A) Budding yeast ARS1 is a well-defined replication origin. ORC binds to the 11 bp ACS
(and the B1 element). The origin of bidirectional replication has been precisely identified
by replication initiation point (RIP) mapping immediately downstream of the ORC
binding site. Collectively, the B elements are essential for origin activity, but their
number, type, and position are variable among budding yeast origins.
(B) Fission yeast ars2004 is approximately 1 kb in length and contains three asymmetric AT-rich sequences (blue boxes) that are required for origin activity.
(C) The DHFR locus in Chinese hamster ovary cells is an example of a broad initiation zone.
Replication initiates in the 55 kb intergenic region between the DHFR and 2BE2121
genes. Replication initiation occurs most frequently from the three sites designated with
triangles.
(D) The human β-globin locus is an example of a confined replication site. Replication
initiates efficiently from a region <10 kb spanning the β-globin gene.
(E) The chorion amplicon DAFC-66D is an example of a confined replication site. The
amplification enhancer ACE3 and primary origin Ori-β are both necessary and sufficient
for amplification.
[Figure 1-1 image. Panels: A. Budding yeast ARS1; B. Fission yeast ars2004; C. Chinese hamster DHFR locus; D. Human β-globin locus; E. Fruit fly chorion amplicon DAFC-66D.]
In fission yeast, the ARS assay identified origin sequences approximately 1 kb in length.
Although a consensus sequence could not be computationally extracted, fission yeast origins
were found to display an extended asymmetric AT-rich sequence, and several discrete regions
within individual origins have been shown to influence ORC binding and origin activation. For
example, ars2004 contains three different required regions, but they can also be replaced with 40
base pair poly(dA/dT) fragments to restore origin activity (OKUNO et al. 1997) (Figure 1-1B).
Identification of individual metazoan replication origins
When applied to metazoan systems, the ARS assay was unsuccessful in systematically
identifying specific replication origins (GILBERT and COHEN 1989; MASUKATA et al. 1993). It
appeared that many sequences could support autonomous replication in this assay (KRYSAN and
CALOS 1991). Additional experiments determined that circular DNA could bind ORC and
replicate in mammalian cell culture without any sequence specificity (KRYSAN et al. 1993;
SCHAARSCHMIDT et al. 2004). These results were consistent with experiments using Xenopus
eggs, where any injected sequence could be fully replicated (HARLAND and LASKEY 1980).
Although these results discredited the existence of a universal sequence-specific replicator in
metazoans, investigators acknowledged that because the early embryo displayed a very high
density of origin firing to accommodate the rapid cell division cycles, this could be a specialized
case where specific origins are not used.
The discovery of site-specific replication initiation in mammalian cells (described below)
demonstrated that non-random origin selection could be utilized during DNA replication. The
identification of additional metazoan origins to survey the extent of site-specific initiation would
rely on developing experimental methods to look for replication origins in well-delineated
genomic regions. Approximately 20 metazoan replication origins have been identified and
studied using multiple methods (ALADJEM et al. 2006). In the next section, two select examples
are described to illustrate the principles of origin discovery and analysis. In addition, they
represent two classes of replication origins: broad replication zones that contain multiple
infrequent initiation sites and confined replication sites from which initiation events occur at a
high frequency.
The first mammalian origin identified was the dihydrofolate reductase (DHFR) locus in
Chinese hamster ovary (CHO) cells (HEINTZ and HAMLIN 1982) (Figure 1-1C). Its discovery and
subsequent analysis were facilitated by several experimentally advantageous characteristics of the
cell line. This genomic region becomes amplified in the presence of increasing concentrations of
the competitive inhibitor methotrexate, allowing cell lines containing an approximate 800-fold
amplification of the 240 kb region encompassing the DHFR gene to be isolated (MILBRANDT et
al. 1981). The high copy number of the amplified sequence could be visualized as distinct
restriction fragments against the background of single copy genomic DNA on an agarose gel.
This property, combined with the ability to arrest these cells in a G0 state and track resumption
of the cell cycle through the G1/S boundary in the presence of radiolabeled nucleotide, meant
that the earliest-replicating restriction fragments could be visualized and positionally mapped
(HEINTZ and HAMLIN 1982).
The discovery of the DHFR locus replication origin is representative of a general strategy
to identify newly replicated DNA molecules by their increased abundance compared to non-replicated DNA. Because many researchers used this locus to study gene regulation, gene and
restriction maps of the region were available. Since its discovery the DHFR locus has been the
subject of many studies aimed at finely mapping the position of the multiple origins in this
region. The storied history and often-conflicting results related to this locus are described in a
recent review (HAMLIN et al. 2010). The current understanding is that the DHFR locus is a broad
replication initiation zone where replication initiates infrequently from one of several origins.
The identification of the human β-globin locus origin also was facilitated by its location in a well-studied gene region (Figure 1-1D). Newly synthesized leading strand DNA was isolated by
pulse-labeling cells with the nucleotide analog bromodeoxyuridine (BrdU) in the presence of
emetine, which inhibits lagging strand synthesis, and then recovering single-stranded DNA
fractions that were enriched for BrdU incorporation (KITSBERG et al. 1993). These samples were
slotted onto filters and hybridized with strand specific probes across a 200 kb region. Replication
direction could be inferred by determining whether hybridization was more abundant using the
plus or minus template probe. Likewise, the replication origin could be identified as the site of
tail-to-tail DNA synthesis or the junction at which leading strand DNA was on opposite
templates. This example is representative of a general strategy to identify origins by determining
the transition point at which leading and lagging strands switch templates (HAMLIN et al. 2010)
(Figure 1-2C). The spacing of the probes allowed the human β-globin origin to be positioned
within a well-defined 10 kb fragment. Thus, the human β-globin locus is an example of a
confined replication site, where replication initiates from a predominant site.
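The template-switch logic described above can be sketched in a few lines of Python. This is a hedged illustration, not the published analysis: the input format, the pseudocount, and the function names are assumptions. Each probe position carries a hybridization signal for the plus-template and minus-template probes, and a candidate switch lies between adjacent probes whose strand bias changes sign. Whether a given sign change marks an origin (diverging forks) or a termination zone (converging forks) depends on the probe convention and must be decided from the polarity of the switch.

```python
import math


def strand_bias(plus_signal, minus_signal, pseudocount=1.0):
    """log2 ratio of plus- to minus-template probe signal.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((plus_signal + pseudocount) / (minus_signal + pseudocount))


def find_switch_intervals(probes):
    """Return (left_kb, right_kb) intervals where strand bias changes sign.

    probes: list of (position_kb, plus_signal, minus_signal), sorted by position.
    """
    intervals = []
    for (p1, a1, b1), (p2, a2, b2) in zip(probes, probes[1:]):
        if strand_bias(a1, b1) * strand_bias(a2, b2) < 0:
            intervals.append((p1, p2))
    return intervals
```

The resolution of such a call is set by the probe spacing, which is why the origin in the text is localized to a fragment rather than to a single position.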
The origin sequence from the human β-globin locus could be moved to an ectopic site
and still confer origin activity, the first such demonstration in a mammalian system (ALADJEM et
al. 1998). The initiation activity at the human β-globin locus requires a region 40 kb upstream
called the locus control region (LCR). Although origin function requires the LCR at the native
locus, initiation could be observed at ectopic sites without the LCR. This result suggests that
distal cis regulatory sequences can influence replication initiation, but their functions can be
substituted, presumably with sequences that display similar properties, the nature of which is
not well understood.
Figure 1-2. DNA structures at a replication origin.
(A) ORC marks all potential sites of origin activation in eukaryotes. ORC-bound sites can be
identified by chromatin immunoprecipitation followed by competitive or quantitative
PCR (ChIP-qPCR), hybridization to a microarray (ChIP-chip), or high-throughput
sequencing (ChIP-seq).
(B) Replication bubbles are found at origins due to the melting of double stranded DNA and
initiation of DNA synthesis. Replication bubbles can be identified by 2-D gel mapping
techniques or the "bubble-trap method" (see text for details).
(C) Replication origins correspond to the junction where leading and lagging strands switch
templates. Initial identification of the replication origin at the human β-globin locus
relied on this property.
(D) Newly replicated DNA can be isolated by pulse labeling cells with the nucleotide analog
BrdU, followed by anti-BrdU immunoprecipitation. BrdU-IP DNA can be identified by
competitive or quantitative PCR, hybridization to a microarray, or high-throughput
sequencing (e.g. Repli-seq).
(E) Short nascent strands can be isolated by size fractionation on a sucrose gradient or low
melting agarose gel. An additional λ-exonuclease treatment step removes nicked DNA
that lacks a 5' RNA primer and is unprotected from digestion. Short nascent strands can
be identified by competitive or quantitative PCR, hybridization to a microarray, or high-throughput sequencing (the last method has not been reported but is technically feasible).
[Figure 1-2 image. Panels: A. ORC-bound DNA; B. Replication bubbles; C. Junction where leading and lagging strands switch templates; D. Anti-BrdU immunoprecipitated DNA; E. Short nascent strands (resistant to λ-exonuclease digestion).]
A powerful approach developed to analyze replication origins on an individual level was
two-dimensional (2-D) mapping (FANGMAN and BREWER 1991). Though developed to study
ARS function in budding yeast, these methods were adapted to study replication origins in
metazoans. Neutral-alkaline 2-D gels are used to map origins based on nascent strand sizes
(HUBERMAN et al. 1987) (Figure 1-3A) whereas neutral-neutral 2-D gels map origins by branch
topology (BREWER and FANGMAN 1987) (Figure 1-3B). An additional advantage of these
methods is that origin efficiency, the percentage of cells within a population that activates
initiation from a particular site, can be roughly determined. For example, depending on the
intensity of the bubble arc compared to the Y arc in a neutral-neutral 2-D gel, one can infer the
proportion of cells in which a particular origin is active (Figure 1-3B). Using 2-D analysis, the
discrete origins in the DHFR locus were found to fire with low efficiency (MESNER et al. 2003).
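To make the notion of origin efficiency concrete, here is a deliberately crude Python sketch. It assumes that the quantified intensities of the bubble arc and the Y arc are proportional to the number of restriction fragments replicated actively and passively, respectively, and it ignores complications such as bubbles from off-center origins converting to Y shapes before replication of the fragment is complete. The function name and the simple ratio are illustrative assumptions, not a method taken from the cited studies.

```python
def estimate_origin_efficiency(bubble_intensity, y_arc_intensity):
    """Rough fraction of fragments replicated from an internal origin.

    Assumes arc intensities scale with the number of molecules in each form;
    this is only a first-order estimate, as described in the lead-in.
    """
    total = bubble_intensity + y_arc_intensity
    if total <= 0:
        raise ValueError("arc intensities must sum to a positive value")
    return bubble_intensity / total
```

Under these assumptions, a faint bubble arc next to a strong Y arc yields a low estimate, consistent with an origin that fires in only a minority of cells.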
2-D gel mapping techniques have been considered the gold standard in precisely mapping
replication origins, but their use has technical limitations. Replication bubble structures are
scarce in a population of asynchronous cells, making their detection difficult unless the origin is
very efficient or the cell population is synchronized, not a trivial task in metazoan cell culture.
Furthermore, they require the investigator to know what genomic region to assess. Prior to the
availability of complete genome sequences, origin identification relied on the existence of well-delineated gene regions to assess, for example, where nascent DNA molecules or replication
bubbles were positioned. Thus, most of the identified origins were in regions that were well
studied with respect to the regulation of gene expression, and there were no known examples of
metazoan origins located in so-called gene deserts, extended regions of DNA without any known
genes.
Figure 1-3. Experimental methods to identify replication origins.
(A) Neutral-alkaline 2-D gels are used to map origins based on nascent strand sizes. The axes
indicate the order of electrophoresis. The 2X position marks the spot of nearly
completely replicated DNA. The high molecular weight of this structure makes it the
slowest migrating (farthest to the left). When the second dimension is run using an
alkaline gel, the nascent strands separate but have a large molecular weight, nearly
identical to the parental strand.
(B) Neutral-neutral 2-D gels map origins by branch topology. The axes indicate the order of
electrophoresis. DNA containing a replication origin exhibits the pattern of the bubble
arc. When the second dimension is run using a neutral gel, origin-centered DNA that has
replicated the most is also impeded the most in the gel, due to its large bubble topology.
DNA that is passively replicated by an adjacent origin exhibits the pattern of a Y arc.
(C) Repli-Seq is a variation of BrdU immunoprecipitation coupled to the precise partitioning
of cells into sample groups based on flow cytometry-measured cell cycle stage. Each of
these BrdU-IP samples is then subject to high-throughput sequencing, and the data is
assembled into mapped sequence reads throughout S phase for any genomic region. The
position of an early firing origin can be inferred through the inverted V shape of the
mapped sequence tags. Adapted from (HANSEN et al. 2010).
[Figure 1-3 image. Panels: A. Neutral-alkaline 2-D gel analysis; B. Neutral-neutral 2-D gel analysis; C. Repli-Seq (BrdU pulse in vivo; sort cells into S phase samples by FACS; IP BrdU-DNA; sequence).]
Methods of replication origin discovery on a genome-wide scale
Though the DHFR and human β-globin locus replication origin regions were first
identified using the methods described above, these origins have been subject to many
experimental analyses to further characterize their replication properties as well as refine the
strategy of identifying new origins. One general approach is to isolate newly replicated DNA
strands and then determine to what genomic regions they map. The DNA isolation is
accomplished through primarily one of two methods. The first is to isolate nascent DNA of a
particular size, typically in the range of 0.5-2 kb, through a sucrose gradient or on an agarose gel.
An additional -exonuclease treatment removes background generated from broken DNA
fragments, as this enzyme will degrade DNA fragments not protected by a 5' RNA primer
(designated short nascent strands) (Figure 1-2E). The second method to isolate is to pulse label
cells with the nucleotide analog bromodeoxyuridine (BrdU) and immunoprecipitate newly
synthesized DNA with an anti-BrdU antibody (designated BrdU-IP DNA) (Figure 1-2D). To
determine the identity or enrichment of these molecules in the era before complete genome
sequences required the investigator to probe a specific genomic region either by hybridization to
positionally mapped DNA clones or quantitative PCR. However, the availability of complete
genome sequences allows these short nascent strands and BrdU-IP DNA to be identified using
microarrays or high-throughput sequencing.
Repli-Seq is a variation of BrdU immunoprecipitation coupled to the precise partitioning
of cells into sample groups based on flow cytometry-measured cell cycle stage (HANSEN et al.
2010). Each of these BrdU-IP samples is then subject to high-throughput sequencing, and the
data is assembled into mapped sequence reads throughout S phase for any genomic region. Data
generated from this method is shown in Figure 1-3C. Early initiating origins can be inferred
through the inverted V shape of newly synthesized DNA.
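The inference of early origins from the inverted V shape can be illustrated with a minimal sketch. It assumes a hypothetical input layout (read counts per genomic bin for each sorted S phase fraction, ordered early to late) and a simple read-weighted mean replication time. This is an illustration of the idea only, not the analysis used by Hansen et al.

```python
from typing import List


def early_origin_candidates(fractions: List[List[float]]) -> List[int]:
    """Return indices of bins that replicate earlier than both neighbors.

    fractions[f][b] is the (hypothetical) read count of bin b in S phase
    fraction f, with fractions ordered from early (0) to late.
    """
    n_fractions = len(fractions)
    n_bins = len(fractions[0])

    # Mean replication time per bin: read-weighted average of the fraction index.
    timing = []
    for b in range(n_bins):
        total = sum(fractions[f][b] for f in range(n_fractions))
        if total == 0:
            # Bins without reads get an infinite time so they are never called.
            timing.append(float("inf"))
        else:
            weighted = sum(f * fractions[f][b] for f in range(n_fractions))
            timing.append(weighted / total)

    # With early S phase drawn at the top, an inverted V is a local minimum
    # of replication time along the chromosome.
    return [
        b
        for b in range(1, n_bins - 1)
        if timing[b] < timing[b - 1] and timing[b] < timing[b + 1]
    ]


if __name__ == "__main__":
    example = [
        [1, 3, 8, 3, 1],  # early S
        [3, 5, 1, 5, 3],  # mid S
        [6, 2, 1, 2, 6],  # late S
    ]
    print(early_origin_candidates(example))
```

In this toy example the central bin is the only one that replicates earlier than both of its neighbors, so it is the only candidate reported.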
The Hamlin lab developed a method to identify origins that also uses the general
principle of isolating newly replicating DNA and then identifying these molecules on a
genome-wide scale. The first step exploits the property of circular DNA molecules, including replication
bubbles, to be trapped in the agarose-plugged well following electrophoresis (MESNER et al.
2006) (Figure 1-2B). This DNA can be cloned into a genomic library and identified using
microarrays or sequencing. Although the methodology is published, the microarray results
identifying these replication origins are only referenced as submitted material in a published review
article and thus will not be discussed (HAMLIN et al. 2010).
The final method to map replication origins is the genome-wide localization of ORC by
chromatin immunoprecipitation followed by hybridization to a microarray (ChIP-chip) (WYRICK
et al. 2001) or high throughput sequencing (ChIP-seq). Because ORC marks all potential sites of
replication initiation, identifying genome-wide ORC localization serves as a proxy for
uncovering all potential replication origins (Figure 1-2A).
Table 1-2. Genome-wide approaches to metazoan origin identification.
ORIs = mapped replication origins; OBS = ORC binding sites; "~" = significantly enriched for.
Inter-origin distances are given in brackets.

Lucas et al, 2007
  Approach: Human lymphoblastoid cells; short nascent strands; 1.4 Mb tiling array
  Origins identified: 28 new
  Transcription: ORIs ~ annotated genes
  Chromatin: ORIs ~ DNaseI hypersensitive sites
  Timing / Efficiency: Early firing origins ~ H3acK9,14
  Notes: No exonuclease treatment

Cadoret et al, 2008
  Approach: HeLa cells; short nascent strands; ENCODE 30 Mb tiling array (1% genome)
  Origins identified: 283 [1 kb to 500 kb, average 63 kb]
  Transcription: ORIs ~ gene-rich GC regions, CpG islands, c-Jun and c-Fos binding sites
  Chromatin: ORIs ~ DNaseI hypersensitive sites, H3Ac, H3K4Me2, H3K4Me3 (though 44% show no enrichment)
  Timing / Efficiency: Timing independent of origin density for select regions examined
  Notes: 71% overlap with Hansen early ORIs; <14% overlap with Karnani ORIs

Sequeira-Mendes et al, 2009
  Approach: Mouse ES cells; short nascent strands; 10.1 Mb tiling array (0.4% genome)
  Origins identified: 97 [average 103 kb]
  Transcription: 85% map to txn units; 44% map to promoters; promoter-ORIs ~ expressed genes
  Chromatin: Select ORIs show H3Ac, H3K4Me2, H3K4Me3 enrichment by ChIP-qPCR
  Timing / Efficiency: Efficient origins ~ overlap TSS

Hansen et al, 2010
  Approach: Four human cell lines; Repli-Seq; whole-genome coverage
  Origins identified: Erythroid n/a; Lymphoid male 1131; Lymphoid female 1199; hESCs 1809; Fibroblast 1547
  Transcription: Early ORIs ~ high gene density, gene expression, GC content, CpG density
  Chromatin: ORIs ~ DNaseI hypersensitive sites; nuclear lamina association ~ late firing origins
  Timing / Efficiency: Early firing origins ~ expressed genes
  Notes: 49% of genome shows replication timing plasticity

Karnani et al, 2010
  Approach: HeLa cells; BrdU-IP and short nascent strands; ENCODE 30 Mb tiling array (1% genome)
  Origins identified: BrdU-IP 815; short nascent 320; overlap ORIs 150 [BrdU-IP 27.6 kb; short nascent 58.4 kb]
  Transcription: 68% ORIs map +/-5 kb from TSS; ORIs ~ RNA Pol II binding sites
  Chromatin: ORIs ~ H3Ac, H3K4Me2, H3K4Me3
  Timing / Efficiency: 49% early replicating

MacAlpine et al, 2010
  Approach: Drosophila Kc167 cells; ORC ChIP-chip and HU BrdU-IP; whole genome tiling arrays
  Origins identified: 5135 ORC binding sites (OBS); 630 early origins [average 11 kb]
  Transcription: 2/3 OBSs overlap with TSS; no correlation to individual TF binding site
  Chromatin: OBSs ~ H3.3 deposition at promoter and non-promoter ORIs, depletion for bulk nucleosomes
  Timing / Efficiency: 30% early replicating; higher density of OBS ~ early replicating

The analyses of short nascent strands, BrdU-immunoprecipitated nascent DNA, Repli-Seq,
and ORC ChIP to identify replication origins have resulted in a dramatic increase in the
number of origins experimentally identified (summarized in Table 1-2). In some cases, such as
the identification of replication origins in human cells (HeLa), there was not significant overlap
among the origin datasets, suggesting that the different methods selectively identify a subset of
all origins. Nevertheless, these studies represent a one to two orders of magnitude increase in the
number of previously known metazoan replication origins and provide valuable datasets with
which to compare replication origins to genomic features such as transcription and chromatin
modifications (discussed below).
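The degree of overlap between two origin datasets can be made concrete with a minimal sketch. It assumes each origin is a hypothetical (chromosome, start, end) interval with half-open coordinates, and that an origin counts as shared if it overlaps any origin in the other dataset by at least one base. The published comparisons may have used different overlap criteria.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed representation.
Interval = Tuple[str, int, int]


def overlap_fraction(a: List[Interval], b: List[Interval]) -> float:
    """Fraction of origins in dataset a that overlap at least one origin in b."""
    if not a:
        return 0.0
    hits = 0
    for chrom, start, end in a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(c == chrom and s < end and start < e for c, s, e in b):
            hits += 1
    return hits / len(a)


if __name__ == "__main__":
    dataset_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    dataset_b = [("chr1", 150, 250), ("chr2", 300, 400)]
    print(f"{overlap_fraction(dataset_a, dataset_b):.2f}")
```

Note that the measure is asymmetric: the fraction of dataset a found in b generally differs from the fraction of b found in a, so both directions are worth reporting when comparing methods.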
Developmental gene amplification as an origin discovery tool
Developmental gene amplification has served as an important origin discovery tool and
model for investigating the regulation of metazoan replication origins. Specific genomic regions
are amplified through replication-based mechanisms, either chromosomal excision followed by
extra-chromosomal amplification or repeated bidirectional replication from an endogenous
chromosomal locus, to increase gene copy number. Mapping of these amplification origins has
added to the catalog of known metazoan replication origins (CLAYCOMB and ORR-WEAVER
2005). Developmental gene amplification increases the DNA template so that gene products
needed at high levels can be made in a short developmental period, such as rRNA in
frog oocytes and cocoon proteins in Sciarid fly salivary glands.
In Drosophila, two chorion gene clusters are amplified in ovarian follicle cells, somatic
epithelial cells that surround the oocyte and secrete the components of the eggshell, by repeated
origin activation at the endogenous locus (see Figure 1-4A and 1-4B for developmental context).
This process enables the eggshell proteins to be produced and the eggshell structure to be
constructed in less than five hours. Importantly for its use as a replication model, gene
amplification uses the same replication machinery and cell cycle kinase regulation that is used in
the canonical S phase. Several female-sterile mutants have been isolated that produce a thin
eggshell phenotype due to the inadequate transcription of eggshell genes, and these mutants have
been found to contain mutations in replication factors such as Orc2, Mcm6, and Dbf4/chiffon
(LANDIS et al. 1997; LANDIS and TOWER 1999; SCHWED et al. 2002).
Figure 1-4. Gene amplification in Drosophila follicle cells as a model to study metazoan DNA
replication.
(A) Gene amplification occurs in the context of egg chamber development and maturation.
Drosophila ovaries are made of multiple ovarioles, or strings of developing egg
chambers.
(B) Adapted from (SPRADLING 1993). DAPI staining of egg chambers. Distinct egg chamber
stages can be visually distinguished by the egg chamber size, proportion of nurse cell
volume compared to the oocyte volume, and size and shape of the anterior dorsal
filaments.
(C) Schematic of gene amplification. Initiation is the repeated rounds of origin firing,
resulting in an amplified region. Elongation is the replication of existing replication forks
such that there is no increase in DNA copy number at the origin but an increase in the
flanking regions.
(D) The process of gene amplification can be visualized using immunofluorescence
experiments, monitoring the incorporation of a nucleotide analog. Fluorescence in situ
hybridization (FISH) using a genomic probe marks a specific site of amplification.
[Figure 1-4. Panels: A. Drosophila ovaries (ovariole); B. Egg chamber stages (st10A-B, st11, st12, st13); C. Schematic of gene amplification (initiation, elongation); D. Visualizing gene amplification (DAPI, EdU, genomic FISH probe)]
At the major chorion amplicon, Drosophila Amplicon in Follicle Cells (DAFC)-66D, the cis
requirements for amplification have been finely mapped (Figure 1-1E). DAFC-66D is an
example of a confined replication site. The majority of initiation events, as determined by 2-D gel
mapping, have been narrowed down to the 884 base pair element Oriβ located in the intergenic
region between two chorion genes. Additionally, a 320 base pair enhancer element, Amplification
Control Element on the Third (ACE3), is required for amplification and has been shown to bind
ORC in vivo, though it itself does not serve as a replication origin in the endogenous locus
(CLAYCOMB and ORR-WEAVER 2005). ACE3 is proposed to serve as a nucleation point for ORC
binding, possibly by permitting a chromatin environment where pre-RCs can assemble. In
support of this model, multimers of ACE3 are capable of autonomously inducing amplification at
ectopic sites in the genome, although at lower levels than the endogenous locus (CARMINATI et
al. 1992).
Another advantage of studying replication using follicle cell amplicons is that they permit
in vivo study of replication timing. Replication events during gene amplification can be precisely
examined because the process occurs after genomic replication is shut off in development.
Developing egg chamber stages are morphologically distinct and can be isolated and analyzed
using methods such as Southern blotting, quantitative PCR, and protein localization by
immunofluorescence and chromatin immunoprecipitation. For DAFC-66D, replication initiation
events occur exclusively in stages 10B and 11, followed by a period when only
replication elongation occurs in stages 12 and 13 (CLAYCOMB et al. 2002) (see Figure 1-4C for
schematic representations of initiation and elongation). The absence of initiation events in these
later stages allows the elongating replication forks to be visualized as double bar structures in
immunofluorescence experiments, monitoring the incorporation of BrdU. In contrast, at another
amplicon DAFC-62D, there are two stages of replication initiation: one in stage 10B, followed
by a second round of replication initiation in stage 13 (XIE and ORR-WEAVER 2008). Thus,
follicle cell gene amplification provides a unique opportunity to study replication events during
development.
Properties of metazoan replication origins
Sequence properties and genome distribution of replication origins
Early autoradiography studies revealed that DNA replication initiates from hundreds to
thousands of sites in the genome (HUBERMAN and RIGGS 1968; TAYLOR 1968), and more
recently, genome-wide studies have shown that cells utilize distinct spatial replication programs.
Origin distribution is quite variable, though the average is similar to previously determined
estimates of replicon size (approximately 50 kb). Inter-origin distance can range from 1 to 500
kb, with sparse origin distribution in gene poor regions. Origin density is strongly correlated with
gene density and the related measure of high GC content (CADORET et al. 2008). This correlation
of origin density to high GC content appears to result from replication origins frequently
overlapping with transcription units (discussed below). At the level of individual origins,
Karnani et al found that in human cells there was an enrichment of AT sequences in the region
100 base pairs to each side of the replication peak, consistent with previous analyses of
individual metazoan origins (KARNANI et al. 2010).
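The inter-origin distances discussed above follow from simple arithmetic on mapped origin positions. The sketch below assumes origins on a single chromosome are given as hypothetical midpoint coordinates in base pairs. The example positions are invented for illustration and are not taken from any of the cited datasets.

```python
from statistics import mean
from typing import Dict, List


def inter_origin_distances(positions: List[int]) -> List[int]:
    """Distances between neighboring origins on one chromosome (bp)."""
    ordered = sorted(positions)
    return [right - left for left, right in zip(ordered, ordered[1:])]


def summarize(distances: List[int]) -> Dict[str, float]:
    """Minimum, maximum, and mean inter-origin distance."""
    return {
        "min": min(distances),
        "max": max(distances),
        "mean": mean(distances),
    }


if __name__ == "__main__":
    # Invented origin midpoints: a tight cluster followed by a sparse region.
    origins = [564_000, 0, 64_000, 1_000]
    gaps = inter_origin_distances(origins)
    print(gaps)
    print(summarize(gaps))
```

A mean alone hides the variability described in the text: a dataset with tightly clustered origins and a few very large gaps can have the same average as an evenly spaced one, so the range is reported alongside it.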
A study in Drosophila cells found that, although there was no simple consensus motif for
ORC binding, machine learning approaches could be applied to discriminate between ORC-bound
and non-ORC-bound sequences (MACALPINE et al. 2010), suggesting that sequence likely
plays some indirect role in promoting or permitting ORC binding. For example, it appears that
DNA topology is an important determinant of ORC binding in vitro, as ORC binds preferentially
to superhelical DNA without sequence specificity (REMUS et al. 2004), and the short sequences
found to be enriched in the MacAlpine study may promote this DNA structure.
Two studies examined whether replication origins mapped to evolutionarily conserved
regions (CRs), and both observed significant enrichment (CADORET et al. 2008; KARNANI et al.
2010). Karnani et al observed that 50% of their identified origins overlapped with conserved
elements defined by the Encyclopedia of DNA Elements (ENCODE) consortium of
investigators. Cadoret et al found that 70% of their identified origins overlapped with CRs,
lower than the overlap of protein coding exons with CRs (86%) but comparable to promoter
regions (72%). It is unclear how much of this evolutionary constraint is due primarily to the
replication, transcription, and/or epigenetic functions of these conserved sequences, though each
is likely to contribute to the observed conservation.
Replication and transcription
The relationship between replication initiation and transcription is an area of active
investigation, and examples from many systems show both negative and positive effects of
transcription on replication. Because DNA replication and RNA transcription share the same
template, one possibility is that transcriptional elongation poses a steric inhibition on possible
origin licensing or activation. In the DHFR locus, replication initiation normally occurs from one
of several potential sites in the 50 kb intergenic spacer between the DHFR gene and the
downstream 2BE2121 gene. When transcription is extended into the non-transcribed spacer via
deletion of 3' processing signals, replication initiation is suppressed in the intergenic region and
confined to the region immediately adjacent to 2BE2121 (MESNER and HAMLIN 2005). In
contrast, when the DHFR promoter is deleted, initiation events can be detected within the DHFR
gene, though with lower overall efficiency throughout the region (KALEJTA et al. 1998).
There are also several specific examples of a positive effect of transcription on
replication initiation. In some cases, transcriptional regulatory elements are necessary for
replication initiation. The LCR in the human β-globin locus and transcription factor binding sites
in the c-myc locus play important roles in promoting replication initiation (ALADJEM et al. 1995;
LIU et al. 2003). In a more direct example, pre-loading transcription factors onto DNA resulted
in site-specific replication in Xenopus eggs (DANIS et al. 2004). Histone modifications
associated with open and active chromatin, specifically acetylated H3, were localized to this
region and are likely to contribute more directly to origin firing than transcription factor binding
itself (replication and histone modifications discussed below).
Despite the diverse methods, one of the most striking findings of recent genome-wide
mapping studies was the significant number of origins that corresponded to gene regions and in
particular, the transcription start sites (TSS) of active genes. Although examples of replication
origins coinciding with transcription units or promoters such as the human myc locus and human
lamin B2 locus were previously known, it was unclear how representative these individual
examples were of all replication origins. In mouse ES cells, 85% of origins map to transcription
units and 44% specifically to promoter regions (SEQUEIRA-MENDES et al. 2009). In comparison
to all promoters, these promoter origins were significantly enriched in cap analysis gene
expression (CAGE) tags, which mark the 5' end of mRNAs, derived from early embryos. This
result suggests that origins correspond to actively transcribed promoters. Furthermore, the
promoter-associated origins were found to be the most efficient as assessed by abundance of
nascent strands quantified using qPCR (SEQUEIRA-MENDES et al. 2009). The mouse ES cell
results are very consistent with work from human cells, where 68% of origins were located within
5 kb upstream or downstream of a TSS. In addition, origins were significantly enriched near sites of RNAPII
binding (KARNANI et al. 2010). Furthermore, in Drosophila cell culture, two-thirds of ORC
binding sites were found to overlap with TSSs, primarily at actively transcribed genes
(MACALPINE et al. 2010).
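A statistic such as the fraction of origins lying within 5 kb of a TSS can be computed as in the following minimal sketch. It assumes hypothetical origin midpoints and TSS coordinates on a single chromosome. The window size is a parameter, and the example coordinates are invented.

```python
from bisect import bisect_left
from typing import List


def fraction_near_tss(origins: List[int], tss: List[int], window: int = 5000) -> float:
    """Fraction of origins within `window` bp (either side) of the nearest TSS."""
    if not origins:
        return 0.0
    tss_sorted = sorted(tss)
    near = 0
    for pos in origins:
        # The nearest TSS is either the first one at or after pos, or the one before it.
        i = bisect_left(tss_sorted, pos)
        neighbors = tss_sorted[max(i - 1, 0): i + 1]
        if any(abs(pos - t) <= window for t in neighbors):
            near += 1
    return near / len(origins)


if __name__ == "__main__":
    origin_midpoints = [1_000, 20_000, 52_000, 90_000]
    tss_positions = [3_000, 50_000]
    print(fraction_near_tss(origin_midpoints, tss_positions))
```

Whether such a fraction is meaningful depends on the background expectation: in a gene-dense region, a large share of randomly placed sites would also fall within 5 kb of a TSS, which is why the studies above report enrichment relative to a control.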
The co-localization of replication origins with actively transcribed promoters raises the
possibility that specific transcription factors contribute to origin function at many initiation sites.
Cadoret et al found that replication origins in human cells were significantly enriched in binding
sites for c-JUN and c-FOS (CADORET et al. 2008), which together form the AP-1 complex and
regulate a variety of cellular processes such as proliferation, differentiation, and apoptosis. The
c-Myc protein has been shown to bind to its own promoter, and mutations in this sequence
abolish replication in a plasmid replication assay (ARIGA and IGUCHI-ARIGA 1989). However,
Cadoret et al found no significant enrichment of c-Myc binding sites in origins identified in their
study (CADORET and PRIOLEAU 2010), and it is possible that a direct role of this protein in
replication initiation is limited to the c-Myc locus or a small subset of replication origins. With
over 5000 ORC binding sites to query, MacAlpine et al reasoned they would observe conserved
transcription factor binding motifs, or enrichment of specific functional gene categories, if a
select group of transcription factors were responsible for ORC localization (MACALPINE et al.
2010). However, they did not observe either of these possibilities, suggesting that specific
transcription factors are not generally responsible for origin specification. Instead, specific
transcription factors may regulate replication at a small subset of origins.
Studies using Drosophila follicle cell gene amplification as a model system have revealed
a direct role of transcription factors on DNA replication initiation. The chorion amplicon
DAFC-66D is regulated by the E2F, Myb, and Rb complexes (BEALL et al. 2004; BOSCO et al. 2001).
For example, hypomorphic mutations in Rb, or E2f1 mutations that prevent binding to Rb, result in
inappropriate genomic replication during amplification stages (BOSCO et al. 2001). This result
supports a model where E2F1/Rb directly represses replication at DAFC-66D, which is
independent of transcriptional regulation, until the appropriate developmental time when Rb is
phosphorylated and E2F1 can positively influence amplification. These complexes have been
localized to the amplicon by chromatin immunoprecipitation and shown to physically interact
with ORC. An interaction between Rb and replication initiation sites has been reported in
mammalian cells, with Rb localizing to initiation sites after DNA damage to repress replication
(AVNI et al. 2003). There is also evidence that the insect molting hormone ecdysone regulates
amplification, as dominant negative mutants of the ecdysone receptor (EcR) display reduced
amplification (HACKNEY et al. 2007). These results are consistent with gene amplification in the
salivary gland of Sciara coprophila, where ecdysone treatment can induce premature
amplification (FOULK et al. 2006). This study also demonstrated that ScEcR binds this
amplification origin in vitro at a putative ecdysone response element.
Analysis of the follicle cell amplicon DAFC-62D has revealed a relationship between
transcription and MCM loading. DAFC-62D exhibits two separate stages of replication initiation,
with a period of elongation in between. When egg chambers were cultured in the presence of the
drug α-amanitin, which inhibits RNA polymerase II dependent transcription, the second round of
replication initiation was specifically inhibited (XIE and ORR-WEAVER 2008). Transcription
inhibition had no effect on ORC localization but specifically inhibited MCM loading at this
second stage of initiation. A direct physical interaction between RNAPII and the MCM complex
has been reported in yeast, raising the possibility that the transcriptional machinery may also
function to promote MCM loading at some replication origins (GAUTHIER et al. 2002; HOLLAND
et al. 2002).
Up to one-third of origins are not associated with known promoters (KARNANI et al.
2010; MACALPINE et al. 2010), indicating that there are other mechanisms of origin specification
that do not involve transcription. Indeed, if replication could only initiate from the site of active
transcription, gene-desert regions would be at serious risk of not being fully replicated every cell
cycle. Conversely, not all active promoters correspond to replication initiation sites. It remains to
be seen what properties permit ORC to bind to active promoter sequences and function as
replication origins, though open and active chromatin is one good candidate (discussed below).
Furthermore, one advantage of having DNA replication and transcription initiation occur from
the same location is that this configuration minimizes the likelihood of head-to-head collisions of
the replication and transcription machineries and disruption of both processes. Gene distribution
studies in bacteria have revealed that 90% of essential genes in B. subtilis and 70% of essential
genes in E. coli are oriented so that DNA replication and transcription are co-directional (MIRKIN
and MIRKIN 2005). It is possible that potentially deleterious consequences of head-on polymerase
collisions are reduced by gene orientation in bacteria, where there is just one replication origin.
In the larger genomes of higher eukaryotes, the coincidence of replication and transcription start
sites may prevent polymerase collision immediately at the initiation site and enable flexibility of
spatial replication programs depending on the developmental stage and cell type.
In contrast to metazoans, DNA replication initiation sites are located primarily in
intergenic regions in budding and fission yeast (RAGHURAMAN et al. 2001; SEGURADO et al.
2003). Additionally, a comparative study of replication origins in Saccharomyces yeast species
revealed a significant enrichment of replication origins between convergent transcription units,
and when located between tandem transcription units, the replication origin was observed closer
to the transcriptional terminator than the promoter (NIEDUSZYNSKI et al. 2006). One possible
explanation for this difference from metazoan replication origins is that the well-defined nature
of ORC binding in yeast may constrain origin positioning, whereas multicellular organisms
require greater flexibility in origin usage and thus have replication origins and active promoters
overlap to coordinate replication and transcription.
Replication and chromatin context
DNA replication occurs in the context of chromatin, the combination of DNA and
associated histone proteins around which DNA wraps to form nucleosomes. The N-terminal tails
of histones are subject to a number of covalent modifications such as acetylation, methylation,
and phosphorylation, which induce chromosomal changes that either promote or inhibit various
genomic processes such as transcription, replication, and recombination (KOUZARIDES 2007).
These histone modifications have been best characterized with regard to transcription, and there
are several well-defined marks characteristic of actively transcribed or repressed chromatin.
However, there is also an interest in the relationship between replication and chromatin state.
Recent work in budding yeast, in which histone proteins around a single origin were purified and
analyzed by high-resolution mass spectrometry to identify all histone modifications throughout the
cell cycle, has revealed dynamic acetylation patterns of histones H3 and H4 (UNNIKRISHNAN et al. 2010).
Multiply acetylated H3 and H4 are required for efficient origin activation during S phase.
Additionally, deletion of the histone deacetylase Rpd3 in budding yeast has been shown to result
in early activation of late origins at non-telomere positions (KNOTT et al. 2009).
Two independent studies have found enrichment of hyperacetylated H3 and H4 at
Drosophila follicle cell amplicons (AGGARWAL and CALVI 2004; HARTL et al. 2007). H4
acetylation did not co-localize with elongating replicating forks, indicating that this modification
is associated with replication initiation and not histone deposition at newly replicated DNA
(HARTL et al. 2007). Loss-of-function mutant clones of the histone deacetylase Rpd3 resulted in
increased acetylation levels and showed increased genomic replication in amplification stage egg
chambers. Furthermore, using a reporter construct, follicle cell amplification could be inhibited
by tethering Rpd3 to the region (AGGARWAL and CALVI 2004).
Genome-wide mapping studies found origins to be enriched for specific active histone
marks such as H3K4 dimethylation, H3K4 trimethylation, and H3 acetylation (CADORET et al.
2008; KARNANI et al. 2010). In addition, ORC binding sites in Drosophila cell culture are
significantly enriched for the histone variant H3.3, which marks active promoters and regulatory
sequences (MACALPINE et al. 2010). Notably, the enrichment of H3.3 is also found at ORC
binding sites not associated with active transcription, indicating that this mark may be more
general to replication origins in Drosophila and not just the promoter associated ones. Currently,
it remains unclear whether ORC binding is an indirect consequence of local chromatin structure
or whether ORC localization is somehow actively regulated. The observation that many regions
of active, open chromatin do not contain replication origins or bind ORC implies that there are
likely to be additional mechanisms to regulate ORC binding to specific sites. As more functional
elements of the genome are mapped, there will be greater understanding of which of these
features are linked to DNA replication initiation.
An attractive hypothesis for why replication origins are significantly enriched for TSS's
is that active transcription necessitates or creates an open chromatin state that is also required for
ORC binding or origin activation. Multiple genome-wide studies have found that the origin
datasets identified are significantly enriched for DNaseI hypersensitive sites (CADORET et al.
2008; HANSEN et al. 2010; LUCAS et al. 2007). DNA regions are hypersensitive to cleavage by
DNaseI when not wrapped in the nucleosome, as when transcription factors displace histone
octamers. In Drosophila cell culture, ORC binding sites are significantly depleted for bulk
nucleosomes (MACALPINE et al. 2010). These results suggest that accessible DNA may be
important for ORC binding or some other step of replication initiation. In budding yeast,
nucleosomes are excluded from the ACS and B elements of ARS1 (LIPFORD and BELL 2001).
Recent studies have demonstrated that origin sequence is sufficient to maintain nucleosome-free
origins on a genome-wide scale, although ORC binding is required for the precise nucleosomal
positioning surrounding the origin (EATON et al. 2010). One possibility is that, in budding yeast,
the nucleosome-free region established by origin sequence, in concert with additional
mechanisms, is necessary for ORC binding. In metazoans, where there is no consensus origin
motif, ORC binding may rely on nucleosome-free DNA established by other means, such as at
promoters during transcription.
Replication timing and cell type specific replication programs
Cells have a distinct temporal replication program. That is, there are regions that are
consistently replicated early in S phase and others that are replicated late in S phase, which can
be assessed across the genome. Work in Drosophila and mammalian cell culture, though not in
yeast, shows a correlation between early origin firing and active transcription (MACALPINE et al.
2004; WHITE et al. 2004). This relationship appears to hold for large zones and not necessarily
the level of individual genes. Furthermore, the two alleles of the same gene can display
asynchronous replication timing in the case of imprinted genes where the expressed allele
replicates earlier than the silent allele (SINGH et al. 2003). Several reviews discuss the
relationship of replication timing and transcription based on analyses of model replication origins
(ALADJEM 2007; HIRATANI et al. 2009). This section will focus on relevant findings regarding
replication and transcription from the genome-wide mapping experiments.
The study in Drosophila cell culture identified 630 hydroxyurea-resistant early origins by
BrdU immunoprecipitation, and higher density of ORC binding corresponded to earlier
replication time (MACALPINE et al. 2010), but Cadoret et al found that replication timing was
independent of origin density (CADORET et al. 2008), which may be due to the limited number of
genomic regions they assessed. Alternatively, the distinction of ORC binding density versus
origin density may be significant. High density of ORC binding may increase the likelihood that
replication will initiate early at this region because there are more potential origins that can be
activated. In contrast, the clustering of origins is not necessarily indicative of replication timing.
A comprehensive study of replication timing comes from the Repli-Seq method, where
there is whole-genome data of newly replicated DNA at six time points spanning the G1/S
transition to the end of S phase (HANSEN et al. 2010). Furthermore, because multiple cell types
were used, the relationship of cell type specific transcription and DNA replication could be
investigated. Similar to previous studies, regions were identified where a gene that is exclusively
expressed in one cell type is early replicating, but it is late replicating in the cell types where it is
not expressed (HATTON et al. 1988). Additional examples of asynchronous replication based on
allelic expression were also identified. In terms of replication features, constant early replication
regions are associated with high gene density, gene expression, Alu density, GC content, and
CpG density, all of which are consistent with the previously reported findings of transcription
and early replication. In addition, genomic regions associated with the nuclear lamina, which are
physically segregated from the internal nuclear compartment, are late replicating.
By looking at pair-wise combinations of cell types for differences in replication timing,
Hansen et al observed that 50% of the human genome displayed plasticity, or variability in
replication timing (HANSEN et al. 2010). One intriguing hypothesis is that this plasticity is due to
the different gene expression patterns among the four cell types, and genomic regions containing
genes that are more similarly expressed in terms of timing and abundance may display uniform
replication timing among cell types. Additional analysis will be required to investigate this
possibility.
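The pair-wise comparison of cell types can be sketched as follows. The sketch assumes hypothetical per-region replication timing values scaled from 0 (early) to 1 (late) for each cell type, and an arbitrary difference threshold for calling a region plastic. Neither the scale nor the threshold is taken from Hansen et al.

```python
from itertools import combinations
from typing import Dict, List


def plastic_fraction(timing: Dict[str, List[float]], threshold: float = 0.25) -> float:
    """Fraction of regions whose timing differs between any pair of cell types.

    timing maps a cell type name to per-region timing values (0 = early, 1 = late);
    all lists must cover the same regions in the same order.
    """
    cell_types = list(timing)
    n_regions = len(timing[cell_types[0]])
    if n_regions == 0:
        return 0.0
    plastic = 0
    for i in range(n_regions):
        # A region is called plastic if at least one pair of cell types
        # differs by more than the (assumed) threshold.
        if any(
            abs(timing[a][i] - timing[b][i]) > threshold
            for a, b in combinations(cell_types, 2)
        ):
            plastic += 1
    return plastic / n_regions


if __name__ == "__main__":
    example = {
        "lymphoid": [0.1, 0.5, 0.9, 0.2],
        "fibroblast": [0.1, 0.9, 0.9, 0.3],
        "hESC": [0.2, 0.5, 0.8, 0.2],
    }
    print(plastic_fraction(example))
```

Because a region is counted once if any pair differs, the fraction can only grow as more cell types are added, which is worth keeping in mind when comparing plasticity estimates across studies with different numbers of cell lines.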
Despite the abundance of information that recent genome-wide mapping studies have
provided, one of their common limitations is that the experiments are performed in cell culture,
where replication events cannot be directly studied at the time of origin firing. Gene
amplification in Drosophila follicle cells offers a powerful model for metazoan DNA replication
without these limitations. Also, regarding the question of replication plasticity, the mapped
amplification origins do not correspond to origins used in the canonical S phase, which provides
a model system to examine cell type specific replication programs and why replication origins
used in one cell type are or are not used in other cell types.
The importance of cataloging replication origins
Experimental innovations made possible by complete genome sequences have led to an
approximately 100-fold increase in the number of identified metazoan replication origins. With
improved technology and the falling cost of high-throughput sequencing for whole-genome
experiments, hundreds, if not thousands, more replication origins will be identified. This
invites an evaluation of why identifying replication origins is important and what additional
information will be gained from this endeavor. There are at least two motivations. First,
identifying replication origins on a genome-wide scale will provide a picture of how
chromosomes are normally replicated in S phase: what portion of the genome shows origin
clustering versus dispersed distribution of origins? What areas of the genome show large
inter-origin distances? How large can the inter-origin distance be? A catalog of replication
origins will enable a more direct investigation of the mechanisms regulating replication initiation
for a diverse set of origins.
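Given such a catalog, the questions about clustering and inter-origin distance reduce to simple operations on sorted origin coordinates. The sketch below is a minimal illustration, not a published method: the origin positions are invented, and the 20 kb cutoff used to call two neighboring origins "clustered" is an arbitrary assumption chosen only for the example.

```python
import statistics


def inter_origin_distances(origin_positions):
    """Distances (bp) between neighboring origins on one chromosome."""
    positions = sorted(origin_positions)
    return [right - left for left, right in zip(positions, positions[1:])]


def summarize_origins(origin_positions, cluster_cutoff_bp=20_000):
    """Summary statistics for an origin catalog.

    cluster_cutoff_bp is a hypothetical threshold: gaps at or below it
    are counted as clustered, larger gaps as dispersed.
    """
    gaps = inter_origin_distances(origin_positions)
    clustered = sum(1 for gap in gaps if gap <= cluster_cutoff_bp)
    return {
        "n_origins": len(origin_positions),
        "max_gap_bp": max(gaps),
        "median_gap_bp": statistics.median(gaps),
        "fraction_clustered_gaps": clustered / len(gaps),
    }


# Invented origin coordinates (bp) on a single chromosome.
toy_origins = [10_000, 25_000, 30_000, 530_000, 545_000, 900_000]
summary = summarize_origins(toy_origins)
print(summary)
```

The largest gap in a summary of this kind points to the regions of sparse origin density that may be most vulnerable to incomplete replication.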
Second, once we know what the typical spatial replication program looks like for a given
cell type, we can examine the situation in abnormal or disease-causing states. For example,
recent work sequencing cancer genomes has revealed that amplification of gene regions and
chromosomal rearrangements are widespread (CAMPBELL et al. 2008; STEPHENS et al. 2009). In
addition, cancer cells often show de-regulation of replication components (PETROPOULOU et al.
2008). One possibility is that re-replication at origins results in increased DNA copy levels that
are subsequently retained in the chromosome, possibly via recombination or non-homologous
end joining at sites of DNA breaks. It will be interesting to see if the regions most susceptible to
amplification in cancer cells also contain replication origins in the untransformed cell state. It
also will be important to assess whether common amplifications or rearrangements found in
cancer cells are due to the distinct spatial and temporal replication program of the cell type from
which they originate.
For regions of low origin density, how do cells ensure the region will be fully replicated?
During the cell cycle, the entire genome must be fully replicated for each daughter cell to inherit
a complete and accurate copy of the genome. Regions where origins are far apart may be
vulnerable to incomplete replication if replication forks collapse and cannot restart or if no other
origins are activated in the intervening region. Studies have shown that replicon size can be as
large as 500 kb (CADORET et al. 2008). Future studies can examine whether regions of sparse
origin density are particularly susceptible to chromosomal loss. Potentially vulnerable regions of
sparse origin density can be used as models to study how intra-S phase checkpoints are activated
when genomic replication is incomplete.
Summary of thesis
This thesis investigates the properties and regulation of metazoan DNA replication
origins using Drosophila follicle cell gene amplification as a model system. We integrate
genome-wide analyses with detailed molecular characterization of individual origins to elucidate
the properties that enable specific genomic regions to serve as replication origins. In Chapter 2,
we use a comparative genomic hybridization strategy to identify all of the follicle cell amplicons
and identify two new follicle cell amplicons. As gene amplification is typically a strategy to
augment gene expression, we examine the relationship between the amplified regions and gene
expression. We also determine the localization of ORC on a genome-wide scale. We confirm that
H4 acetylation is enriched in amplified regions but find that it does not by itself determine
amplification. Instead, levels of H4 acetylation appear to correspond to the magnitude of gene
amplification. We also identify an amplicon, DAFC-22B, that displays strain-specific
amplification and use it to study the requirements of ORC binding.
In Chapter 3, we characterize DAFC-34B in detail. We find that amplification may not fit
the model of augmenting gene expression as one of the genes in the region, encoding a vitelline
envelope protein of the eggshell, is expressed prior to amplification stages whereas another gene
is expressed in only a small subset of follicle cells. DAFC-34B displays two rounds of gene
amplification, but unlike the previously characterized DAFC-62D, the second round of
replication initiation is not dependent on transcription. We map the amplification origin to a 1 kb
region using nascent strand analysis and find that the origin corresponds to the transcription unit
of the vitelline membrane gene. We find that ORC binds in a broad 10 kb zone at the first stage
of amplification but is surprisingly absent in subsequent stages, despite a second round of
replication initiation. We determine the cis requirements for amplification and find that a 6 kb
region is sufficient for amplification at an ectopic site. This work highlights the power of using
Drosophila follicle cell amplification to study replication as both genome-wide and individual
analyses of replication origins can be performed to study what properties confer origin function
and what strategies are used to regulate replication initiation.
REFERENCES
AGGARWAL, B. D., and B. R. CALVI, 2004 Chromatin regulates origin activity in Drosophila
follicle cells. Nature 430: 372-376.
ALADJEM, M. I., 2007 Replication in context: dynamic regulation of DNA replication patterns in
metazoans. Nat Rev Genet 8: 588-600.
ALADJEM, M. I., A. FALASCHI and D. KOWALSKI, 2006 Eukaryotic DNA Replication Origins in
DNA Replication and Human Disease, edited by M. L. DEPAMPHILIS. Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY.
ALADJEM, M. I., M. GROUDINE, L. L. BRODY, E. S. DIEKEN, R. E. FOURNIER et al., 1995
Participation of the human beta-globin locus control region in initiation of DNA
replication. Science 270: 815-819.
ALADJEM, M. I., L. W. RODEWALD, J. L. KOLMAN and G. M. WAHL, 1998 Genetic dissection of a
mammalian replicator in the human beta-globin locus. Science 281: 1005-1009.
ARIAS, E. E., and J. C. WALTER, 2007 Strength in numbers: preventing rereplication via multiple
mechanisms in eukaryotic cells. Genes Dev 21: 497-518.
ARIGA, H., and M. M. IGUCHI-ARIGA, 1989 [DNA replication and RNA transcription regulated
by c-myc protein]. Tanpakushitsu Kakusan Koso 34: 1163-1174.
AVNI, D., H. YANG, F. MARTELLI, F. HOFMANN, W. M. ELSHAMY et al., 2003 Active localization
of the retinoblastoma protein in chromatin and its response to S phase DNA damage. Mol
Cell 12: 735-746.
BEALL, E. L., M. BELL, D. GEORLETTE and M. R. BOTCHAN, 2004 Dm-myb mutant lethality in
Drosophila is dependent upon mip130: positive and negative regulation of DNA
replication. Genes Dev 18: 1667-1680.
BELL, S. P., and A. DUTTA, 2002 DNA replication in eukaryotic cells. Annu Rev Biochem 71:
333-374.
BELL, S. P., and B. STILLMAN, 1992 ATP-dependent recognition of eukaryotic origins of DNA
replication by a multiprotein complex. Nature 357: 128-134.
BOSCO, G., W. DU and T. L. ORR-WEAVER, 2001 DNA replication control through interaction of
E2F-RB and the origin recognition complex. Nat Cell Biol 3: 289-295.
BREWER, B. J., and W. L. FANGMAN, 1987 The localization of replication origins on ARS
plasmids in S. cerevisiae. Cell 51: 463-471.
CADORET, J. C., F. MEISCH, V. HASSAN-ZADEH, I. LUYTEN, C. GUILLET et al., 2008 Genome-wide
studies highlight indirect links between human replication origins and gene
regulation. Proc Natl Acad Sci U S A 105: 15837-15842.
CADORET, J. C., and M. N. PRIOLEAU, 2010 Genome-wide approaches to determining origin
distribution. Chromosome Res 18: 79-89.
CAMPBELL, P. J., P. J. STEPHENS, E. D. PLEASANCE, S. O'MEARA, H. LI et al., 2008 Identification
of somatically acquired rearrangements in cancer using genome-wide massively parallel
paired-end sequencing. Nat Genet 40: 722-729.
CARMINATI, J. L., C. G. JOHNSTON and T. L. ORR-WEAVER, 1992 The Drosophila ACE3 chorion
element autonomously induces amplification. Mol Cell Biol 12: 2444-2453.
CLAYCOMB, J. M., D. M. MACALPINE, J. G. EVANS, S. P. BELL and T. L. ORR-WEAVER, 2002
Visualization of replication initiation and elongation in Drosophila. J Cell Biol 159: 225-236.
CLAYCOMB, J. M., and T. L. ORR-WEAVER, 2005 Developmental gene amplification: insights
into DNA replication and gene expression. Trends Genet 21: 149-162.
DANIS, E., K. BRODOLIN, S. MENUT, D. MAIORANO, C. GIRARD-REYDET et al., 2004
Specification of a DNA replication origin by a transcription complex. Nat Cell Biol 6:
721-730.
EATON, M. L., K. GALANI, S. KANG, S. P. BELL and D. M. MACALPINE, 2010 Conserved
nucleosome positioning defines replication origins. Genes Dev 24: 748-753.
FANGMAN, W. L., and B. J. BREWER, 1991 Activation of replication origins within yeast
chromosomes. Annu Rev Cell Biol 7: 375-402.
FOULK, M. S., C. LIANG, N. WU, H. G. BLITZBLAU, H. SMITH et al., 2006 Ecdysone induces
transcription and amplification in Sciara coprophila DNA puff II/9A. Dev Biol 299: 151-163.
GAUTHIER, L., R. DZIAK, D. J. KRAMER, D. LEISHMAN, X. SONG et al., 2002 The role of the
carboxyterminal domain of RNA polymerase II in regulating origins of DNA replication
in Saccharomyces cerevisiae. Genetics 162: 1117-1129.
GILBERT, D., and S. N. COHEN, 1989 Autonomous replication in mouse cells: a correction. Cell
56: 143-144.
HACKNEY, J. F., C. PUCCI, E. NAES and L. DOBENS, 2007 Ras signaling modulates activity of the
ecdysone receptor EcR during cell migration in the Drosophila ovary. Dev Dyn 236:
1213-1226.
HAMLIN, J. L., L. D. MESNER and P. A. DIJKWEL, 2010 A winding road to origin discovery.
Chromosome Res 18: 45-61.
HANSEN, R. S., S. THOMAS, R. SANDSTROM, T. K. CANFIELD, R. E. THURMAN et al., 2010
Sequencing newly replicated DNA reveals widespread plasticity in human replication
timing. Proc Natl Acad Sci U S A 107: 139-144.
HARLAND, R. M., and R. A. LASKEY, 1980 Regulated replication of DNA microinjected into eggs
of Xenopus laevis. Cell 21: 761-771.
HARTL, T., C. BOSWELL, T. L. ORR-WEAVER and G. BOSCO, 2007 Developmentally regulated
histone modifications in Drosophila follicle cells: initiation of gene amplification is
associated with histone H3 and H4 hyperacetylation and H1 phosphorylation.
Chromosoma 116: 197-214.
HATTON, K. S., V. DHAR, E. H. BROWN, M. A. IQBAL, S. STUART et al., 1988 Replication
program of active and inactive multigene families in mammalian cells. Mol Cell Biol 8:
2149-2158.
HEINTZ, N. H., and J. L. HAMLIN, 1982 An amplified chromosomal sequence that includes the
gene for dihydrofolate reductase initiates replication within specific restriction fragments.
Proc Natl Acad Sci U S A 79: 4083-4087.
HIRATANI, I., S. TAKEBAYASHI, J. LU and D. M. GILBERT, 2009 Replication timing and
transcriptional control: beyond cause and effect--part II. Curr Opin Genet Dev 19: 142-149.
HOLLAND, L., L. GAUTHIER, P. BELL-ROGERS and K. YANKULOV, 2002 Distinct parts of
minichromosome maintenance protein 2 associate with histone H3/H4 and RNA
polymerase II holoenzyme. Eur J Biochem 269: 5192-5202.
HUBERMAN, J. A., and A. D. RIGGS, 1968 On the mechanism of DNA replication in mammalian
chromosomes. J Mol Biol 32: 327-341.
HUBERMAN, J. A., L. D. SPOTILA, K. A. NAWOTKA, S. M. EL-ASSOULI and L. R. DAVIS, 1987 The
in vivo replication origin of the yeast 2 microns plasmid. Cell 51: 473-481.
JACOB, F., and S. BRENNER, 1963 [On the regulation of DNA synthesis in bacteria: the
hypothesis of the replicon.]. C R Hebd Seances Acad Sci 256: 298-300.
KALEJTA, R. F., X. LI, L. D. MESNER, P. A. DIJKWEL, H. B. LIN et al., 1998 Distal sequences, but
not ori-beta/OBR-1, are essential for initiation of DNA replication in the Chinese hamster
DHFR origin. Mol Cell 2: 797-806.
KARNANI, N., C. M. TAYLOR, A. MALHOTRA and A. DUTTA, 2010 Genomic study of replication
initiation in human chromosomes reveals the influence of transcription regulation and
chromatin structure on origin selection. Mol Biol Cell 21: 393-404.
KITSBERG, D., S. SELIG, I. KESHET and H. CEDAR, 1993 Replication structure of the human
beta-globin gene domain. Nature 366: 588-590.
KNOTT, S. R., C. J. VIGGIANI, S. TAVARE and O. M. APARICIO, 2009 Genome-wide replication
profiles indicate an expansive role for Rpd3L in regulating replication initiation timing or
efficiency, and reveal genomic loci of Rpd3 function in Saccharomyces cerevisiae. Genes
Dev 23: 1077-1090.
KOUZARIDES, T., 2007 Chromatin modifications and their function. Cell 128: 693-705.
KRYSAN, P. J., and M. P. CALOS, 1991 Replication initiates at multiple locations on an
autonomously replicating plasmid in human cells. Mol Cell Biol 11: 1464-1472.
KRYSAN, P. J., J. G. SMITH and M. P. CALOS, 1993 Autonomous replication in human cells of
multimers of specific human and bacterial DNA sequences. Mol Cell Biol 13: 2688-2696.
LANDIS, G., R. KELLEY, A. C. SPRADLING and J. TOWER, 1997 The k43 gene, required for
chorion gene amplification and diploid cell chromosome replication, encodes the
Drosophila homolog of yeast origin recognition complex subunit 2. Proc Natl Acad Sci U
S A 94: 3888-3892.
LANDIS, G., and J. TOWER, 1999 The Drosophila chiffon gene is required for chorion gene
amplification, and is related to the yeast Dbf4 regulator of DNA replication and cell
cycle. Development 126: 4281-4293.
LIPFORD, J. R., and S. P. BELL, 2001 Nucleosomes positioned by ORC facilitate the initiation of
DNA replication. Mol Cell 7: 21-30.
LIU, G., M. MALOTT and M. LEFFAK, 2003 Multiple functional elements comprise a mammalian
chromosomal replicator. Mol Cell Biol 23: 1832-1842.
LUCAS, I., A. PALAKODETI, Y. JIANG, D. J. YOUNG, N. JIANG et al., 2007 High-throughput
mapping of origins of replication in human cells. EMBO Rep 8: 770-777.
MACALPINE, D. M., H. K. RODRIGUEZ and S. P. BELL, 2004 Coordination of replication and
transcription along a Drosophila chromosome. Genes Dev 18: 3094-3105.
MACALPINE, H. K., R. GORDAN, S. K. POWELL, A. J. HARTEMINK and D. M. MACALPINE, 2010
Drosophila ORC localizes to open chromatin and marks sites of cohesin complex
loading. Genome Res 20: 201-211.
MASUKATA, H., H. SATOH, C. OBUSE and T. OKAZAKI, 1993 Autonomous replication of human
chromosomal DNA fragments in human cells. Mol Biol Cell 4: 1121-1132.
MELIXETIAN, M., and K. HELIN, 2004 Geminin: a major DNA replication safeguard in higher
eukaryotes. Cell Cycle 3: 1002-1004.
MESNER, L. D., E. L. CRAWFORD and J. L. HAMLIN, 2006 Isolating apparently pure libraries of
replication origins from complex genomes. Mol Cell 21: 719-726.
MESNER, L. D., and J. L. HAMLIN, 2005 Specific signals at the 3' end of the DHFR gene define
one boundary of the downstream origin of replication. Genes Dev 19: 1053-1066.
MESNER, L. D., X. LI, P. A. DIJKWEL and J. L. HAMLIN, 2003 The dihydrofolate reductase origin
of replication does not contain any nonredundant genetic elements required for origin
activity. Mol Cell Biol 23: 804-814.
MILBRANDT, J. D., N. H. HEINTZ, W. C. WHITE, S. M. ROTHMAN and J. L. HAMLIN, 1981
Methotrexate-resistant Chinese hamster ovary cells have amplified a 135-kilobase-pair
region that includes the dihydrofolate reductase gene. Proc Natl Acad Sci U S A 78:
6043-6047.
MIRKIN, E. V., and S. M. MIRKIN, 2005 Mechanisms of transcription-replication collisions in
bacteria. Mol Cell Biol 25: 888-895.
NIEDUSZYNSKI, C. A., Y. KNOX and A. D. DONALDSON, 2006 Genome-wide identification of
replication origins in yeast by comparative genomics. Genes Dev 20: 1874-1879.
OKUNO, Y., T. OKAZAKI and H. MASUKATA, 1997 Identification of a predominant replication
origin in fission yeast. Nucleic Acids Res 25: 530-537.
PETROPOULOU, C., P. KOTANTAKI, D. KARAMITROS and S. TARAVIRAS, 2008 Cdt1 and Geminin
in cancer: markers or triggers of malignant transformation? Front Biosci 13: 4485-4494.
RAGHURAMAN, M. K., E. A. WINZELER, D. COLLINGWOOD, S. HUNT, L. WODICKA et al., 2001
Replication dynamics of the yeast genome. Science 294: 115-121.
RAO, H., and B. STILLMAN, 1995 The origin recognition complex interacts with a bipartite DNA
binding site within yeast replicators. Proc Natl Acad Sci U S A 92: 2224-2228.
REMUS, D., E. L. BEALL and M. R. BOTCHAN, 2004 DNA topology, not DNA sequence, is a
critical determinant for Drosophila ORC-DNA binding. Embo J 23: 897-907.
SCHAARSCHMIDT, D., J. BALTIN, I. M. STEHLE, H. J. LIPPS and R. KNIPPERS, 2004 An episomal
mammalian replicon: sequence-independent binding of the origin recognition complex.
Embo J 23: 191-201.
SCHWED, G., N. MAY, Y. PECHERSKY and B. R. CALVI, 2002 Drosophila minichromosome
maintenance 6 is required for chorion gene amplification and genomic replication. Mol
Biol Cell 13: 607-620.
SEGURADO, M., A. DE LUIS and F. ANTEQUERA, 2003 Genome-wide distribution of DNA
replication origins at A+T-rich islands in Schizosaccharomyces pombe. EMBO Rep 4:
1048-1053.
SEQUEIRA-MENDES, J., R. DIAZ-URIARTE, A. APEDAILE, D. HUNTLEY, N. BROCKDORFF et al.,
2009 Transcription initiation activity sets replication origin efficiency in mammalian
cells. PLoS Genet 5: e1000446.
SINGH, N., F. A. EBRAHIMI, A. A. GIMELBRANT, A. W. ENSMINGER, M. R. TACKETT et al., 2003
Coordination of the random asynchronous replication of autosomal loci. Nat Genet 33:
339-341.
SPRADLING, A., 1993 Developmental Genetics of Oogenesis in The Development of Drosophila
melanogaster, edited by M. BATE and A. MARTINEZ ARIAS. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY.
STEPHENS, P. J., D. J. MCBRIDE, M. L. LIN, I. VARELA, E. D. PLEASANCE et al., 2009 Complex
landscapes of somatic rearrangement in human breast cancer genomes. Nature 462: 1005-1010.
STINCHCOMB, D. T., M. THOMAS, J. KELLY, E. SELKER and R. W. DAVIS, 1980 Eukaryotic DNA
segments capable of autonomous replication in yeast. Proc Natl Acad Sci U S A 77:
4559-4563.
TANAKA, S., T. UMEMORI, K. HIRAI, S. MURAMATSU, Y. KAMIMURA et al., 2007 CDK-dependent
phosphorylation of Sld2 and Sld3 initiates DNA replication in budding yeast. Nature 445:
328-332.
TAYLOR, J. H., 1968 Rates of chain growth and units of replication in DNA of mammalian
chromosomes. J Mol Biol 31: 579-594.
THOMER, M., N. R. MAY, B. D. AGGARWAL, G. KWOK and B. R. CALVI, 2004 Drosophila double-parked
is sufficient to induce re-replication during development and is regulated by
cyclin E/CDK2. Development 131: 4807-4818.
UNNIKRISHNAN, A., P. R. GAFKEN and T. TSUKIYAMA, 2010 Dynamic changes in histone
acetylation regulate origins of DNA replication. Nat Struct Mol Biol 17: 430-437.
WALTER, J. C., and H. ARAKI, 2006 Activation of Pre-replication Complexes in DNA Replication
and Human Disease, edited by M. L. DEPAMPHILIS. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY.
WHITE, E. J., O. EMANUELSSON, D. SCALZO, T. ROYCE, S. KOSAK et al., 2004 DNA replication-timing
analysis of human chromosome 22 at high resolution and different developmental
states. Proc Natl Acad Sci U S A 101: 17771-17776.
WYRICK, J. J., J. G. APARICIO, T. CHEN, J. D. BARNETT, E. G. JENNINGS et al., 2001 Genome-wide
distribution of ORC and MCM proteins in S. cerevisiae: high-resolution mapping of
replication origins. Science 294: 2357-2360.
XIE, F., and T. L. ORR-WEAVER, 2008 Isolation of a Drosophila amplification origin
developmentally activated by transcription. Proc Natl Acad Sci U S A 105: 9651-9656.
YANOW, S. K., Z. LYGEROU and P. NURSE, 2001 Expression of Cdc18/Cdc6 and Cdt1 during G2
phase induces initiation of DNA replication. Embo J 20: 4648-4656.
ZEGERMAN, P., and J. F. DIFFLEY, 2007 Phosphorylation of Sld2 and Sld3 by cyclin-dependent
kinases promotes DNA replication in budding yeast. Nature 445: 281-285.
ZHONG, W., H. FENG, F. E. SANTIAGO and E. T. KIPREOS, 2003 CUL-4 ubiquitin ligase maintains
genome stability by restraining DNA-replication licensing. Nature 423: 885-889.
Chapter Two:
Genome-wide identification of
Drosophila follicle cell amplicons as in vivo model replicons
Jane C. Kim, Jared Nordman, Fang Xie, Helena Kashevsky, Thomas Eng, and
Terry L. Orr-Weaver
Whitehead Institute and Dept. of Biology, Massachusetts Institute of Technology
Cambridge, MA 02142
J.N. performed the 16C follicle cell RNA-seq experiment and contributed to data analysis.
F.X. performed and analyzed histone modification ChIP-qPCR and tethering experiments.
H.K. performed the aCGH experiment of OrRTOW flies.
T.E. performed the aCGH experiment of OrRMOD flies.
J.K. performed all other experiments and data analysis.
ABSTRACT
We report the genome-wide identification of all follicle cell amplicons in Drosophila, uncovering
two new amplified regions. We determined the precise localization of the origin recognition
complex (ORC) on a genome-wide level and observed that, at the start of synchronous
amplification, ORC localizes to the six amplicons with levels corresponding to the magnitude of
amplification. Additionally, we investigated amplification with respect to transcription and
chromatin state. The levels and timing of gene expression in some amplicons suggest that gene
amplification is not exclusively a developmental strategy to promote high expression levels.
Follicle cell amplicons are enriched for tetra-acetylated H4, but this mark is not sufficient for
ORC localization or amplification. In addition to genome-wide analyses, we investigated the
replication properties of one new amplicon in finer molecular detail. Strikingly, DAFC-22B
shows strain-specificity in amplification, a property that is correlated with the ability to localize
ORC. We identified sequence differences between closely related amplifying and non-amplifying
strains and used P element mediated transformation to test sufficiency for ORC
binding and amplification at this region.
INTRODUCTION
The initiation of DNA replication occurs from discrete genomic regions called replication
origins. Although the protein complexes that bind DNA and license origins for replication are
known and well conserved, the properties that enable a DNA sequence to function as a
replication origin in metazoans are more poorly delineated than in simpler eukaryotes (CVETIC
and WALTER 2005; GILBERT 2004). One reason is that analysis of metazoan replication origins
has, until recently, proceeded from molecular characterization of a small number of identified
origins (ALADJEM 2007). For example, the DHFR locus in Chinese hamster ovary cells is a
valuable model replication origin and has been analyzed using a variety of methods; replication
at this locus initiates in a broad initiation zone containing multiple inefficient initiation sites
(HAMLIN et al. 2010). Recent genome-wide origin mapping studies in Drosophila, mouse, and
human cell culture have greatly increased the number of identified metazoan origins (CADORET
et al. 2008; KARNANI et al. 2010; MACALPINE et al. 2010; SEQUEIRA-MENDES et al. 2009),
revealing that most origins coincide with active transcription units, specifically the transcription
start site, and marks for open chromatin as well as confirming studies from model replication
origins that there is no sequence specific motif for ORC binding or origin specification in
metazoans.
Gene amplification that is developmentally regulated is an important origin discovery
tool and model for investigating the regulation of metazoan replication origins in vivo
(CLAYCOMB and ORR-WEAVER 2005). Specific genomic regions are amplified through
replication-based mechanisms, either chromosomal excision followed by extra-chromosomal
amplification or repeated bidirectional replication from an endogenous chromosomal locus, to
increase gene copy number. Developmental gene amplification increases the DNA template to
allow for sufficient production of gene products required at high levels, such as rRNA in frog oocytes
and cocoon proteins in sciarid fly salivary glands. In Drosophila, two chorion gene clusters are
amplified in ovarian follicle cells, somatic epithelial cells that surround the oocyte and secrete
the components of the eggshell, by repeated origin activation at the endogenous locus (OSHEIM et
al. 1988; SPRADLING 1981). This process enables the eggshell proteins to be produced in a short
developmental period.
Several features make Drosophila follicle cell gene amplification a powerful model for
investigating metazoan origin function. First, the process occurs within the context of developing
egg chambers that are morphologically distinct and can be isolated for experimental analysis,
allowing replication events to be studied in the context of development. Second, because gene
amplification begins after genomic replication has shut off in development, methods to assess
DNA replication including quantitative PCR or immunofluorescence of the nucleotide analog
bromodeoxyuridine (BrdU) to visualize newly replicated DNA can be used to assess the precise
timing of replication events. The third chromosome chorion amplicon displays an early phase
when replication initiation occurs, resulting in approximately 60-fold amplification of the locus. At
subsequent stages, there are no further initiation events and only elongation of existing
replication forks (CLAYCOMB et al. 2002). Finally, Drosophila genetic tools and manipulation
allow one to test cis requirements for gene amplification. Unlike the DHFR locus, the third
chromosome chorion amplicon is an example of a confined site-specific replication origin. A
discrete 320 base pair amplification control element, ACE3, and non-contiguous 884 base pair
origin, Oriβ, are sufficient for amplification (LU et al. 2001).
Because the chorion amplicons have served as important replication models, Claycomb et
al. undertook a microarray strategy to identify additional follicle cell amplicons (CLAYCOMB et
al. 2004). Two were identified and designated Drosophila Amplicon in Follicle Cells (DAFC)
followed by the cytological position, DAFC-30B and DAFC-62D. Recent analysis of DAFC-62D
has proven it to be a unique replication model. Unlike DAFC-66D, DAFC-62D displays two
stages of replication initiation with a period of elongation in between. The second round of
replication initiation is dependent on RNAPII transcription for localization of the MCM2-7
helicase complex (XIE and ORR-WEAVER 2008). In contrast, ORC localizes to DAFC-62D
throughout all amplification stages and is unaffected by transcription inhibition. These studies
exemplify how identification and molecular characterization of follicle cell amplicons can
uncover new regulatory mechanisms of DNA replication. To identify additional amplification
origins to be used as model replicons, we expanded our follicle cell amplicon discovery
approach, as the cDNA microarrays used in the 2004 study contained fewer than 50% of
Drosophila genes and no intergenic regions.
Here we report the identification of all follicle cell amplicons using high-density
microarrays, uncovering two new amplicons to make a total of six follicle cell amplicons. With a
complete catalog of the follicle cell amplicons, we investigate how gene amplification is related
to or possibly influenced by aspects of the genomic landscape such as transcription and histone
modifications as an in vivo complement to genome-wide analyses of metazoan replication origins
in cell culture. In addition, we investigate the determinants of ORC binding at one amplification
origin by exploiting its property of strain-specific amplification.
RESULTS
Identification of two new follicle cell amplicons by aCGH
To identify all of the amplified regions in follicle cells, we employed an array-based
comparative genomic hybridization (aCGH) strategy. Before follicle cells undergo gene
amplification, they undergo three rounds of endoreduplication, or chromosomal replication
without intervening mitoses, to reach 16C copy levels. We isolated pure populations of follicle
cell nuclei enriched for gene amplification stages by performing flow cytometry to collect 16C
nuclei. DNA from these collections was competitively hybridized with diploid embryonic DNA
on genome-wide tiling microarrays containing one 60-mer probe approximately every 600 bp.
Because of this high-density coverage, we are confident that we have identified all of the
amplicons present in these cells. Using this approach, we confirmed the presence of the four
previously known follicle cell amplicons and identified two new follicle cell amplicons (Figure
2-1A).
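To make the logic of reading amplicons off a tiling array concrete, the following sketch calls amplified regions as runs of consecutive probes whose log2 ratio exceeds a threshold. It is an illustration under assumptions, not the procedure used in this study: the threshold of 1.0, the minimum run length of three probes, and the toy probe values are all hypothetical; only the approximately 600 bp probe spacing is taken from the text.

```python
def call_amplified_regions(probes, min_log2=1.0, min_probes=3):
    """Call amplified regions on one chromosome arm.

    probes: list of (position_bp, log2_ratio) sorted by position.
    Returns (start_bp, end_bp, peak_log2) for every run of at least
    min_probes consecutive probes with log2_ratio >= min_log2.
    Both thresholds are hypothetical choices for this example.
    """
    regions = []
    run = []  # consecutive probes currently above the threshold

    def flush():
        # Record the current run if it is long enough.
        if len(run) >= min_probes:
            start, end = run[0][0], run[-1][0]
            peak = max(ratio for _, ratio in run)
            regions.append((start, end, peak))

    for position, ratio in probes:
        if ratio >= min_log2:
            run.append((position, ratio))
        else:
            flush()
            run.clear()
    flush()  # handle a run that extends to the last probe
    return regions


# Toy data: one probe every 600 bp with invented log2 ratios.
ratios = [0.1, 0.0, 1.2, 1.8, 2.3, 1.9, 1.1, 0.2, 1.5, 0.1]
probes = [(i * 600, ratio) for i, ratio in enumerate(ratios)]
print(call_amplified_regions(probes))
```

Requiring several consecutive probes above the threshold keeps a single noisy probe, such as the isolated value of 1.5 in the toy data, from being called as an amplicon.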
Like the four previously identified amplicons, DAFC-22B and DAFC-34B show a
gradient of replicated DNA that spans approximately 100 kb (Figures 2-1B and 2-1C). The
maximum aCGH enrichment ratios for DAFC-22B and DAFC-34B are less than those of the
chorion amplicons but comparable to DAFC-30B and DAFC-62D (Table 2-1). As expected, the
genomic regions in DAFC-22B and DAFC-34B contain genes. The 20 genes within the amplified
region of DAFC-34B are generally less than 5 kb in length, and one gene, Vm34Ca, located in
the peak of amplification encodes a structural component of the vitelline membrane, the
innermost layer underlying the Drosophila eggshell. Surprisingly, there is one 60 kb gene,
CG7337, in the most amplified region of DAFC-22B, thus differing from the genomic
organization of the other characterized follicle cell amplicons, which contain small genes
encoding eggshell proteins and enzymes that are approximately 1 kb in length.
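For readers more used to fold changes than log2 ratios, the comparison among amplicons can be restated by exponentiating the maximum aCGH values reported in Table 2-1. The short sketch below does only that arithmetic. The log2 values are copied from Table 2-1; the variable names are illustrative, and the resulting ratios should be read as array enrichment relative to the diploid reference rather than as absolute copy number, since array ratios can be compressed.

```python
# Maximum aCGH log2 ratios as listed in Table 2-1.
cgh_max_log2 = {
    "DAFC-22B": 1.5503,
    "DAFC-30B": 1.5354,
    "DAFC-34B": 2.338,
    "DAFC-62D": 1.4083,
    "DAFC-66D": 4.9087,
    "DAFC-7F": 3.6786,
}

# A log2 ratio r corresponds to a 2**r-fold enrichment on the array.
fold_enrichment = {name: 2 ** value for name, value in cgh_max_log2.items()}

# Rank amplicons from highest to lowest maximum enrichment.
ranked = sorted(fold_enrichment, key=fold_enrichment.get, reverse=True)

for name in ranked:
    print(f"{name}: log2 = {cgh_max_log2[name]:.4f}, "
          f"ratio = {fold_enrichment[name]:.1f}-fold")
```

The ranking places the two chorion amplicons, DAFC-66D and DAFC-7F, first, with DAFC-22B falling in the same range as DAFC-30B and DAFC-62D.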
Genome-wide expression analysis of follicle cells
Gene amplification is typically considered a strategy to augment gene expression to high
levels. Female-sterile mutants in replication factors such as ORC2, MCM6, and DBF4 lay eggs
with a thin eggshell phenotype attributed to the inadequate transcription of eggshell genes
Figure 2-1. Genome-wide identification of Drosophila follicle cell amplicons by aCGH
identifies two new amplified regions.
DNA from 16C follicle cells was competitively hybridized with diploid embryonic DNA to microarrays
with approximately one probe every 600 bp. The Y-axis represents the log2 ratio of 16C follicle
cell DNA compared to diploid embryonic DNA. Entire chromosome arms are shown in (A). The
newly identified amplicons DAFC-22B and DAFC-34B are marked in red. (B) Close up 150 kb
view of DAFC-22B. CG7337 spans the entire length of the amplified region. (C) Close up 150 kb
view of DAFC-34B. Vm34Ca encodes a structural component of the vitelline membrane.
Figure 2-1. (A) Panels: Chromosome X, Chromosome 2L, Chromosome 2R, Chromosome 3L, Chromosome 3R, Chromosome 4; labeled amplicons: DAFC-7F, DAFC-30B, DAFC-34B, DAFC-62D, DAFC-66D. (B, C) Panels: DAFC-22B and DAFC-34B; scale bar 50 kb.
Table 2-1. Drosophila Amplicons in Follicle Cells.
Amplicon    Genomic position    CGH max (Log2)    ORC2 max (Log2)    acH4 max (Log2)*
DAFC-22B                        1.5503            1.8736             1.1593
DAFC-30B                        1.5354            2.1616             1.4092
DAFC-34B                        2.338             2.5568             1.4569
DAFC-62D                        1.4083            1.9345             1.3374
DAFC-66D                        4.9087            4.3749             3.6094
DAFC-7F                         3.6786            2.4783             2.33
* Maximum tetra-acetylated H4 ChIP value that corresponds to the ORC binding zone.
(LANDIS et al. 1997; LANDIS and TOWER 1999; SCHWED et al. 2002). To determine the
transcriptional profiles of the genes in the amplified regions, we performed high throughput
RNA sequencing. We isolated RNA from 16C follicle cells recovered by FACS and performed
Illumina sequencing to uncover a global view of transcript levels in amplification stage follicle
cells. 16C cells will include follicle cells from stage 9 egg chambers, when the cells exit the
endocycles, to stage 14, immediately before egg deposition.
Transcript profiles for DAFC-66D and DAFC-7F show high expression levels of the
chorion genes in these regions, confirming that our RNA-seq results are an accurate reflection of
amplification stage follicle cells (Figures 2-2E and 2-2F). For example, at DAFC-66D, cp18,
cp15, cp19, and cp16 are all highly expressed. Although these four genes are located in the same
10 kb most amplified region, cp16 shows at least six-fold lower expression than the other
three by reads per kilobase of exon model per million mapped (RPKM) values. These results
correspond precisely with gene expression studies using developmental Northern blots and in
situ hybridization experiments, thus validating the accuracy of RNA-seq quantification (GRIFFIN-SHEA
et al. 1982). The same is true of the genes in the most amplified region of DAFC-7F; they
are highly expressed but exhibit variable levels consistent with Northern blot analysis (PARKS et
al. 1986). Other genes within the 100 kb amplified gradients of DAFC-66D or DAFC-7F were
expressed at low levels. Thus, being located within an amplicon will not, by default, activate
transcription nor will it, for expressed genes, enhance expression to uniform levels. Although
gene amplification may promote high expression levels of some genes, there are additional
regulatory mechanisms that fine-tune the levels, developmental timing, and spatial specificity of
gene expression.
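The RPKM values used above to compare genes can be illustrated with a minimal sketch. It assumes the standard definition (reads mapped to a gene's exons, divided by exon length in kilobases and by total mapped reads in millions); the function name and the read counts are hypothetical and are not taken from our dataset.

```python
def rpkm(exon_reads: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return exon_reads * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical numbers: two genes of equal exon length in one library.
gene_high = rpkm(exon_reads=600, exon_length_bp=1000, total_mapped_reads=10_000_000)
gene_low = rpkm(exon_reads=100, exon_length_bp=1000, total_mapped_reads=10_000_000)
fold_difference = gene_high / gene_low
print(gene_high, gene_low, fold_difference)
```

Because RPKM normalizes for exon length and sequencing depth, a fold difference between two genes in the same library reflects relative transcript abundance rather than gene size.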
Figure 2-2. Follicle cell expression levels vary among the genes located in the six amplified
regions.
200 kb regions of all follicle cell amplicons (A-F) with sequence reads from RNA-seq above and
aCGH data below. Maximum sequence read is 100 for DAFC-22B, DAFC-30B, and DAFC-62D.
Maximum sequence read is 1000 for DAFC-34B, DAFC-66D, and DAFC-7F.
[Figure 2-2: panels for DAFC-22B, DAFC-30B, DAFC-34B, DAFC-62D, DAFC-66D, and DAFC-7F, each showing RNA-seq reads above aCGH data with gene annotations along the genomic coordinate axis.]
The other four amplicons show distinct transcript profiles. Like the chorion amplicons,
DAFC-34B shows high expression levels for at least two genes in the amplified region (Figure 2-2C).
The vitelline membrane gene, Vm34Ca, had the highest RPKM value in our RNA-seq
dataset, though unlike the chorion amplicons, there is not a cluster of highly expressed genes in
the 10 kb most amplified region of DAFC-34B. In contrast, the transcript profiles of DAFC-30B
and DAFC-62D do not fit the simple model that amplification is a developmental strategy to
promote very high expression levels, as these amplicons contain genes that are only moderately
expressed (Figures 2-2B and 2-2D). Claycomb et al. showed that genes within DAFC-30B and
DAFC-62D such as CG13113 and yellow-g2, respectively, show high expression levels across all
follicle cells in only one or two stages of egg chamber development (CLAYCOMB et al. 2004),
unlike the chorion genes, which show more sustained high expression levels throughout several
stages (PARKS and SPRADLING 1987). These genes show reduced expression in amplification
mutants by in situ hybridization, indicating the importance of amplification for normal
transcription levels (CLAYCOMB et al. 2004). Thus, the lower RPKM values of these genes may
be a reflection of the very narrow developmental window in which they are expressed.
Furthermore, DAFC-30B contains genes at the edge of the amplified gradient that are more
highly expressed than genes in the central region, a unique feature among the six amplicons.
For DAFC-22B, we found that CG7337 is expressed at low levels (Figure 2-2A). While it
is possible that gene amplification is necessary for even these low levels of transcription to be
met, equivalent gene expression of CG7337 in a background that does not amplify the locus
(discussed below) makes this an unlikely explanation. Another possibility is that amplification of
this region does not augment gene expression but may be the indirect consequence of
transcription or active chromatin marks at this region that result in replication initiation.
Figure 2-3. Gene amplification is not required for high follicle cell gene expression.
200 kb regions of six non-amplified genomic loci (A-F) with highly expressed genes. Sequence
reads from RNA-seq are above, and aCGH data is below. 26A and 32E contain vitelline
membrane genes homologous to Vm34Ca.
[Figure 2-3: panels for six non-amplified loci (legible labels include 26A, 30E, and 32E), each showing RNA-seq reads above aCGH data with gene annotations along the genomic coordinate axis.]
If active transcription promotes specific chromatin modifications or other structural
changes that permit gene amplification to occur in follicle cells, we would expect gene regions
with highly expressed genes to become amplified. However, this does not appear to be the case,
as there are highly expressed genes and gene clusters that are not amplified (Figure 2-3). This
observation demonstrates that gene amplification is not the exclusive means by which genes
become highly expressed in follicle cells.
ORC binding in amplicons localizes to the most amplified region
ORC function is essential for DNA replication initiation during genomic replication as
well as gene amplification. ORC localization marks all potential sites of replication initiation in
the genome, and in Drosophila, the chorion amplicon DAFC-66D was the first example of ORC
binding at a replication origin in vivo (AUSTIN et al. 1999). Localization of ORC by chromatin
immunoprecipitation (ChIP) showed enrichment at defined replication control elements ACE3
and Oriβ for DAFC-66D. In addition, ORC was found to co-localize with amplification foci by
immunofluorescence. However, apart from DAFC-62D, for which a 20 kb region was assessed
for ORC binding by ChIP-qPCR, the precise localization of ORC for the other amplicons was
unknown. Therefore, to determine the localization of ORC with respect to the 100 kb amplified
gradients as well as to see whether other genomic regions, especially those showing high
expression levels but no amplification, were bound by ORC, we performed a genome-wide
assessment of ORC localization.
We isolated stage 10 egg chambers by hand, when synchronous follicle cell gene
amplification commences, and performed chromatin immunoprecipitation using a polyclonal
antibody specific for ORC2 followed by hybridization to a high-density tiling microarray (ChIP-chip).
The starting number of egg chambers (approximately 1200) produced enough material so
that no amplification had to be performed on the samples. We found that ORC localizes to all
follicle cell amplicons in a zone centered at the peak of amplification, ranging from
approximately 10 to 30 kb depending on the amplicon. In addition, we found that levels of ORC
enrichment generally corresponded to the magnitude of gene amplification (Figure 2-4 and Table
2-1). The log ratios of ORC binding were highest for the two chorion amplicons and DAFC-34B,
which also show the greatest magnitude of amplification among the six amplicons. These results
are consistent with previous ChIP-qPCR quantification comparing DAFC-62D and DAFC-66D
(XIE and ORR-WEAVER 2008). As the status of ORC binding at DAFC-7F and DAFC-30B was
previously unknown, these studies contribute four new ORC-bound amplification origins to the
catalog of metazoan replication origins.
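As a rough illustration of the correspondence between enrichment and amplification, the maxima listed in Table 2-1 can be compared with a Pearson correlation coefficient. This is only a descriptive sketch over six data points, not a statistical test, and it was not part of the original analysis; the helper function `pearson` is introduced here for illustration.

```python
import math

# Maxima from Table 2-1 (log2 ratios), in the order listed in the table.
amplicons = ["DAFC-22B", "DAFC-30B", "DAFC-34B", "DAFC-62D", "DAFC-66D", "DAFC-7F"]
cgh_max = [1.5503, 1.5354, 2.338, 1.4083, 4.9087, 3.6786]
orc2_max = [1.8736, 2.1616, 2.5568, 1.9345, 4.3749, 2.4783]
ach4_max = [1.1593, 1.4092, 1.4569, 1.3374, 3.6094, 2.33]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r_orc2 = pearson(cgh_max, orc2_max)  # copy number vs. ORC2 enrichment
r_ach4 = pearson(cgh_max, ach4_max)  # copy number vs. tetra-acetylated H4
print(f"CGH vs ORC2: r = {r_orc2:.2f}; CGH vs acH4: r = {r_ach4:.2f}")
```

With only six amplicons, and with DAFC-66D far above the rest, such a coefficient is dominated by the chorion loci and should be read as a summary of the table rather than as evidence of a quantitative law.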
As noted above, many genes and gene clusters are highly expressed in follicle cells
despite not being amplified. We examined ORC binding at several of these highly expressed
regions to see if the absence of ORC binding could explain why these regions are not amplified
(Figure 2-5). Apart from a peak of ORC binding at the dec-1 promoter at 7C, we did not observe
ORC localization at these highly expressed regions. We computationally identified the top
scoring regions of ORC enrichment and found that, while the amplicons were among the top
scoring regions, ORC localized to other discrete genomic regions that were not amplified (Figure
2-6). Although ORC binding is necessary for gene amplification, it is not predictive of
replication initiation at a specific site. Other regulatory mechanisms are likely necessary to
establish the full pre-replicative complex (pre-RC) and activate replication initiation.
Alternatively, ORC localization at these non-amplifying sites may play a role independent of
replication initiation. ORC is required for mating type silencing in budding yeast, maintenance of
heterochromatic silencing in Drosophila via interaction with HP1/Su(Var)205, and establishment
Figure 2-4. The magnitude of gene amplification corresponds to the levels of ORC binding and
tetra-acetylated H4
200 kb regions of all follicle cell amplicons (A-F) with aCGH data, ORC2 ChIP-chip, and tetra-acetylated
H4 ChIP-chip, all in log2 ratios. The chorion amplicons (E,F) show the highest levels
of increased DNA copy number, indicating repeated rounds of origin firing, which correspond to
high levels of ORC and tetra-acetylated H4 enrichment. Y-axis of ChIP-chip data shows Log 2
ratio of immunoprecipitated DNA to stage 10 input DNA.
[Figure 2-4: panels for DAFC-22B, DAFC-30B, DAFC-34B, DAFC-62D, DAFC-66D, and DAFC-7F, each showing aCGH, ORC2 ChIP-chip, and tetra-acetylated H4 ChIP-chip tracks along the genomic coordinate axis.]
Figure 2-5. ORC is not enriched at most highly expressed gene regions.
200 kb regions exhibiting high follicle cell expression from Figure 2-3 (A-F) with aCGH data,
ORC2 ChIP-chip, and tetra-acetylated H4 ChIP-chip, all in log2 ratios. The maximum log2
enrichment of ORC2 at 7C is 1.5935.
[Figure 2-5: panels for the highly expressed, non-amplified regions of Figure 2-3 (legible labels include 9A, 26A, 30E, and 32E), each showing aCGH, ORC2 ChIP-chip, and tetra-acetylated H4 ChIP-chip tracks.]
Figure 2-6. ORC binds to genomic regions that do not become amplified.
200 kb regions of non-amplified, top-scoring ORC2 enrichment values (A-F). aCGH data, ORC2
ChIP-chip, and tetra-acetylated H4 ChIP-chip are shown in log2 ratios.
[Figure 2-6: six panels, each with tracks labeled tetH4ac ChIP, ORC2 ChIP, and CGH.]
of cohesin loading in Xenopus egg extracts (BELL et al. 1993; HUANG et al. 1998; PAK et al.
1997; TAKAHASHI et al. 2004). Similarly, ORC may have roles in gene silencing or cohesin
establishment or maintenance at these non-amplifying regions.
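The computational identification of top-scoring ORC enrichment regions mentioned above can be sketched as follows. The scoring scheme shown here (mean log2 ratio in fixed, non-overlapping windows, ranked from highest to lowest) is an assumption made for illustration and is not necessarily the procedure we used; the function name, window size, and probe values are hypothetical.

```python
from collections import defaultdict


def top_windows(probes, window_bp=10_000, top_n=3, min_probes=2):
    """Rank fixed-size genomic windows by mean ChIP-chip log2 ratio.

    probes: iterable of (chromosome, position_bp, log2_ratio) tuples.
    Returns up to top_n tuples (mean_log2, chromosome, start_bp, end_bp).
    """
    bins = defaultdict(list)
    for chrom, pos, log2_ratio in probes:
        bins[(chrom, pos // window_bp)].append(log2_ratio)
    scored = [
        (sum(values) / len(values), chrom, idx * window_bp, (idx + 1) * window_bp)
        for (chrom, idx), values in bins.items()
        if len(values) >= min_probes  # skip windows with too few probes
    ]
    scored.sort(key=lambda window: window[0], reverse=True)
    return scored[:top_n]


# Hypothetical probe data for three windows on different chromosome arms.
toy_probes = [
    ("3L", 1000, 4.1), ("3L", 1500, 4.3),
    ("2L", 20500, 0.1), ("2L", 21000, -0.2),
    ("X", 5000, 1.5), ("X", 5600, 1.7),
]
ranked = top_windows(toy_probes, top_n=2)
for mean_log2, chrom, start, end in ranked:
    print(f"{chrom}:{start}-{end} mean log2 = {mean_log2:.2f}")
```

Windows ranked in this way can then be compared against the aCGH profile to separate ORC-bound regions that amplify from those that do not.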
H4 acetylation corresponds to the magnitude of gene amplification
DNA replication occurs in the context of chromatin, the combination of DNA and
associated histone proteins around which DNA wraps to form nucleosomes. The N-terminal tails
of histones are subject to a number of covalent modifications that either promote or inhibit
various genomic processes such as transcription, recombination, and replication (KOUZARIDES
2007). Two independent studies have found enrichment of hyperacetylated H3 and H4 at follicle
cell amplicons (AGGARWAL and CALVI 2004; HARTL et al. 2007). Loss-of-function mutant
clones of the histone deacetylase Rpd3 resulted in increased acetylation levels and showed
inappropriate genomic replication in amplification stage egg chambers. Furthermore, follicle cell
amplification using a reporter construct of DAFC-66D could be inhibited by tethering Rpd3 to
the region (AGGARWAL and CALVI 2004).
To investigate the relationship between gene amplification and histone acetylation on a
genome-wide scale, we performed ChIP on stage 10 egg chambers using an antibody against
tetra-acetylated H4, which recognizes acetylated lysines 5, 8, 12, and 16. Enrichment of tetra-acetylated
H4 is found at all six amplicons (Figure 2-4). Like ORC, the ratio of enrichment
generally correlates with the magnitude of amplification (Table 2-1). DAFC-66D shows the greatest
enrichment of tetra-acetylated H4 whereas DAFC-22B shows the smallest enrichment, though
still above background levels. Because this antibody recognizes all four acetylated lysines, we
used antibodies specific for single residues to test their correlation to gene amplification.
Whereas acetylated H4K5 and H4K12 antibodies showed only modest enrichment at the
amplicons by ChIP-qPCR, acetylated H4K8 was highly enriched around the amplification
origins, in a pattern resembling tetra-acetylated H4 distribution with levels correlated to
magnitude of amplification (Figure 2-7 and H4K12 data not shown).
To test whether acetylation levels, particularly that of H4K8, were necessary for
differences in amplification levels, we used the amplification reporter system developed by the
Calvi lab to quantitatively measure DNA copy number and enrichment of H4K8 (AGGARWAL
and CALVI 2004). The reporter designated TT1 contains the 3.8 kb minimal origin with ACE3
and Oriβ from DAFC-66D next to UAS, which binds GAL4 and GAL4 DNA binding domain
(DBD) fusion proteins. Following one-hour heat-shock induction of a GAL4DBD::Rpd3 fusion,
amplification was completely abolished in stage 10 as well as pooled egg chambers of stages 11
and 12, without affecting the endogenous amplicons (Figure 2-8A). TT1 amplification levels
were measured by quantitative PCR using a probe specific to the transposon vector (to
distinguish between TT1 and endogenous DAFC-66D) compared to a non-amplified locus. We
examined stage 10 egg chambers for acetylated H4K8 after induction of the GAL4DBD::Rpd3
fusion and found that this histone mark was significantly reduced at TT1 but essentially
unchanged at the endogenous amplicons (Figure 2-8B). Thus, these results indicate that
acetylation of H4K8 is necessary for amplification of the TT1 transgene.
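The qPCR copy number measurements described above (a target probe compared to a non-amplified locus) can be illustrated with the widely used comparative Ct calculation. This is a sketch under stated assumptions: the formula, the perfect doubling per cycle (efficiency of 2), the use of a non-amplified calibrator DNA sample, and all Ct values are assumptions for illustration, not a record of the exact calculation performed here.

```python
def relative_copy_number(ct_target, ct_reference,
                         ct_target_calibrator, ct_reference_calibrator,
                         efficiency=2.0):
    """Fold change in copy number of a target locus by the comparative Ct method.

    ct_target / ct_reference: Ct values in the test sample for the target
    probe and for a non-amplified reference locus.
    ct_*_calibrator: the same two Ct values in a calibrator sample in which
    the target is assumed not to be amplified.
    efficiency: amplification factor per PCR cycle (2.0 = perfect doubling).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return efficiency ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# than the reference in the test sample, but at the same cycle in the calibrator.
fold = relative_copy_number(ct_target=22.0, ct_reference=24.0,
                            ct_target_calibrator=24.0,
                            ct_reference_calibrator=24.0)
print(fold)
```

Normalizing to a non-amplified locus controls for differences in input DNA between stages, so the resulting value reflects locus-specific copy number changes.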
Recruitment of the histone acetyltransferase HAT1 has been reported to enhance
amplification using the same reporter assay. In contrast, we found that following one-hour heat-shock
induction of a GAL4DBD::HAT1 fusion, there was no effect on TT1 amplification in any
stage for three independent experiments (Figure 2-8C). The previous study measured gene
amplification by quantifying FISH signals and Southern blot experiments, so it is possible that the
discrepancy is due to the greater sensitivity of quantitative PCR in determining DNA copy level
Figure 2-7. Levels of H4K8 acetylation correlate with amplification levels.
ChIP-qPCR analysis showing enrichment levels of tetra-acetylated H4 (A), acetylated H4K8 (B),
and acetylated H4K5 (C) across four amplified regions.
[Figure 2-7: three bar charts (A-C) of ChIP-qPCR enrichment; legible legend entries are input, tetra-AcH4, and AcH4K5.]
Figure 2-8. Tethering of Rpd3 to TT1 represses its amplification and H4K8 acetylation whereas
tethering of HAT1 to TT1 does not affect its amplification.
(A) Stage-specific copy number of TT1 in different genetic backgrounds and experimental
conditions. Arrow points to repressed amplification upon Rpd3 expression and tethering to TT1.
(B) Levels of H4K8 acetylation at TT1 and representative probes from endogenous amplicons in
stage 10. Arrow points to reduction in acetylated H4K8 levels upon Rpd3 expression and
tethering to TT1. (C) TT1 amplification is not affected by HAT1 expression and tethering to
TT1. (D) Levels of H4K8 acetylation in stage 10. "Reduction" in flies containing the hsp::HAT1
transposon could be due to leaky expression of HAT1, causing genome-wide acetylation. (E)
Levels of H4K5 acetylation in stage 10. Arrow points to increased H4K5 acetylation upon HAT1
expression and tethering to TT1.
[Figure 2-8: bar charts (A-E). Legible genotype labels: TT1/+; TT1/+ (heatshock); TT1/+; hsp::Rpd3; TT1/+; hsp::Rpd3 (heatshock); TT1/+; hsp::HAT1; TT1/+; hsp::HAT1 (heatshock). X-axis of the ChIP panels: genomic position, with probes at 7F (ACE1), 30B, 62D (or62), and 66D.]
differences. We performed ChIP-qPCR experiments for acetylated H4K5 and acetylated H4K8
to assess the status of these chromatin marks when HAT1 is tethered to TT1. Flies carrying the
GAL4DBD::HAT1 fusion displayed lower levels of acetylated H4K8 enrichment at the
endogenous amplicons compared with TT1 alone, but they still show the same overall pattern of
acetylated H4K8 enrichment corresponding with amplification levels (Figure 2-8D). HAT1
tethering had no effect on acetylated H4K8 levels at TT1. In
contrast, acetylated H4K5 is found at very low levels except at TT1, where it is
significantly enhanced when HAT1 is induced (Figure 2-8E). HAT1 has been reported to catalyze
acetylation at H4K5 and H4K12. Thus, the absence of H4K8 hyperacetylation may explain why
enhanced amplification was not observed with HAT1 tethering.
DAFC-22B exhibits strain-specific amplification
To characterize further the amplification properties of DAFC-22B, we examined gene
amplification at this locus in a number of genetic backgrounds. Surprisingly, we found that
DAFC-22B displays strain-specific amplification. By aCGH, we found that even in two closely
related "wild type" strains where all five other amplicons are common, DAFC-22B amplifies in
OrRTOW and is not amplified in OrRMOD (Figure 2-9). To determine the stage-specific replication
profile of DAFC-22B, we hand sorted OrRTOW egg chambers to isolate genomic DNA and
performed qPCR quantification of DNA copy levels. We found that replication initiation
increased from stages 10B through 13, resulting in approximately four-fold amplification by
stage 13 (Figure 2-10). In addition, we performed ChIP-chip with ORC2 on the strain that does
not amplify DAFC-22B and found that ORC localization is absent at DAFC-22B despite the
same ORC localization for the other five amplicons (Figure 2-9). Because the determinants of
ORC binding are poorly understood in metazoans, uncovering the difference between the two
Figure 2-9. DAFC-22B exhibits strain-specific amplification that correlates with the ability of
the region to bind ORC.
DAFC-22B is not amplified in the OrRMOD strain, and this region does not bind ORC.
Tetra-acetylated H4 is observed in both amplifying and non-amplifying strains.
[Figure 2-9: two panels, each with tracks labeled ORC2 ChIP, CGH, and tetH4ac ChIP.]
Figure 2-10. DAFC-22B initiates amplification at stage 10B and increases copy number
approximately four-fold by stage 13.
Genomic DNA was isolated from hand sorted staged egg chambers (A-F), and DNA copy levels
were assessed by qPCR compared to a non-amplified locus.
[Figure 2-10: panels for DAFC-22B at stages 10A, 10B, 11, 12, and 13, plus an overlay of all stages; x-axis: genomic position (approximately 1880000 to 1960000).]
strains that is responsible for the difference in amplification will be a powerful model for
understanding how ORC binds to specific DNA regions to promote replication initiation.
Recent genome-wide studies from Drosophila to human cell culture have shown that the
majority of replication origins correspond to gene regions and specifically the transcription start
sites of genes (MACALPINE et al. 2010). Because DAFC-22B coincides with a gene and one of its
isoforms has the 5' end located in the peak of amplification, we examined stage-specific gene
expression of CG7337 in both strains for possible differences that might explain the presence or
absence of amplification at the region. To our surprise, we found that there was no difference in
overall CG7337 expression levels between the amplifying and non-amplifying strain for any
developmental stage of dissected egg chambers (Figure 2-11). Additionally, we used probes that
could distinguish between the A,E-isoforms and D-isoform. Whereas the A,E-isoforms showed
equivalent levels between the two strains, the D-isoform was more highly expressed in the non-amplifying
OrRMOD strain. Because whole egg chambers contain nurse cells and the oocyte in
addition to follicle cells, we isolated purified follicle cell RNA from both whole ovaries and
stage 13 egg chambers but detected no difference in expression levels of the D-isoform.
Intriguingly, there are differences in CG7337 expression between the two strains but they do not
appear to arise from the follicle cells.
We performed ChIP to examine H4 tetra-acetylation at DAFC-22B and observed
enrichment in the non-amplifying OrRMOD strain (Figure 2-9). We observed the same result by
acetylated H4K8 ChIP-qPCR (Figure 2-12). Thus, although levels of H4 acetylation correspond
to the magnitude of gene amplification, H4 acetylation is not sufficient for
replication initiation.
Figure 2-11. CG7337 is expressed at similar levels in follicle cells of amplifying and non-amplifying strains.
RNA was isolated from hand sorted staged egg chambers. cDNA was quantified using a probe
recognizing all five isoforms, the A and E isoforms, or D isoform (two unique probes). To
investigate the difference in D isoform levels, purified follicle cells were isolated from stage 13
egg chambers and whole ovaries.
[Figure 2-11: bar charts comparing the MOD and TOW strains. Panel titles: All CG7337 Isoforms; CG7337 AE Isoforms; CG7337 D Isoform (Probe 1); CG7337 D Isoform (Probe 2); Stage 13; Stage 13 Follicle Cells; Embryo; Total Follicle Cells. X-axes: stages or isoform.]
Figure 2-12. H4K8 acetylation levels are equivalent between the amplifying and non-amplifying
strains by ChIP-qPCR.
Acetylated H4K8 ChIP was performed on both OrRMOD and OrRTOW strains and quantified by
qPCR. No difference in enrichment levels was detected between the two strains.
[Figure 2-12: bar chart titled "acH4K8 ChIP-qPCR" with probes grouped by amplicon (DAFC-22B, DAFC-34B, DAFC-62D, DAFC-66D).]
Mapping cis elements responsible for DAFC-22B amplification
To address whether the difference in gene amplification between OrRTOW and OrRMOD
could be explained by differential activity of a trans-acting factor, we tested whether DAFC-22B
amplification in OrRTOW segregated with the second chromosome on which it is located. This test
is only definitive if the trans-acting factor is not on this chromosome, but consistent with the
difference being a cis effect, amplification was found to segregate with the second chromosome
(data not shown).
In Drosophila, cis requirements for amplification can be determined using P element
mediated transformation to examine amplification of a test sequence at an ectopic site. We
generated transposons containing the 10 kb ORC binding zone from OrRTOW as well as the
equivalent region from OrRMOD by PCR (Figure 2-13). In addition, DAFC-22B is the only
amplicon that also shows ORC binding in cell culture, though the ORC binding regions do not
appear to overlap between egg chambers and cell culture (Figure 2-13). Thus, we also
amplified the 10 kb sequence containing the cell culture ORC binding sites from OrRTOW to test
for amplification. We flanked the 22B sequences with Suppressor of Hairy-wing binding sites to
minimize position effect variability (LU and TOWER 1997). Sequencing these 22B regions
revealed no major rearrangements but a number of 20-30 bp insertion/deletion differences
between the two strains (data not shown). Analysis of these transgenic flies will provide an
important example of the determinants of ORC binding in a metazoan system.
DISCUSSION
We have used Drosophila follicle cell gene amplification as a model system to study
metazoan replication origins: identifying all follicle cell amplicons, performing genome-wide
analyses to look for general relationships of amplification to gene expression and histone
Figure 2-13. Genetic analysis of cis control elements for differential DAFC-22B amplification
The DAFC-22B regions for P element mediated transformation and testing sufficiency for
amplification are shown in black bars. Region A was PCR-amplified from OrRTOW (amplifying)
and OrRMOD (non-amplifying) flies. Region B was amplified from OrRTOW flies. CGH data and
ORC2 ChIP-chip data are shown for reference (scale is Log 2 ratio). ORC2 ChIP-seq data from
Kc cells is also shown (scale is sequence tag density).
[Figure 2-13: genomic region around DAFC-22B (approximately 18800001 to 19600001, 50 kb scale bar) with CGH, ORC2 ChIP-chip, and ORC2 ChIP-seq tracks and gene annotations including the CG7337 isoforms.]
acetylation, as well as investigating these relationships in greater detail at the level of individual
origins. The molecular and genetic tools available in Drosophila permit incredible dexterity in
moving between genome-wide and individual origin analyses, which will likely be more
important as the hypotheses generated from genome-wide studies continue to increase. Thus,
these ORC-bound Drosophila follicle cell amplicons serve as powerful experimental models for
in vivo analysis of replication origin properties and regulatory mechanisms.
In the context of gene amplification, the relationship between replication and
transcription has two facets: how replication and increased DNA copy number affect
transcription as well as how transcription affects replication. The examples from follicle cell
gene amplification reveal that both display complex relationships. First, the identification of all
the amplified regions in Drosophila combined with genome-wide assessment of transcript levels
reveals that the simple model of gene amplification being a developmental strategy to promote
high levels of gene expression, as seems to be the case for the two chorion amplicons, may not
be an absolute one. When examined by RNA-seq of 16C follicle cells, the genes in DAFC-30B
and DAFC-62D are expressed to moderate levels. However, by in situ hybridization, these genes
are highly expressed in one or two stages of egg chamber development, and their levels require
the function of replication proteins (CLAYCOMB et al. 2004). Thus, even a four-fold increase in
DNA may be important for providing sufficient expression levels of these genes. Notably, high
levels of expression do not necessitate gene amplification to reach these large amounts as many
genes show abundant expression without being amplified.
The expression of genes in DAFC-22B and DAFC-34B presents an enigma in explaining
how DNA copy number affects transcription. At DAFC-22B, CG7337 is expressed to the same
levels regardless of whether the region is amplified. Whether this outcome is due to CG7337 not
requiring amplification for expression is unknown. An alternative possibility is that the strains
that do not amplify DAFC-22B have a more efficient or active promoter than the strains that do
amplify the region. However, given that two of the strains that do and do not amplify DAFC-22B
are closely related OrR strains, we favor the explanation that gene amplification is not required
for expression of CG7337. At DAFC-34B, Vm34Ca is present at high levels in our RNA-seq
data, but this gene begins to be expressed in stage 8 (MINDRINOS et al. 1985). This is the first
example of a gene in a follicle cell amplicon that is expressed before synchronous amplification
begins and raises the question of whether gene amplification is necessary for Vm34Ca
expression. Another possibility is that high levels of Vm34Ca promote gene amplification at this
locus. Furthermore, there are at least three other homologous vitelline membrane genes in the
Drosophila genome, two of which are in a cluster at cytological position 26A containing several
genes expressed in follicle cells, yet none of these genes are amplified.
The effect of transcription on replication initiation is not well delineated, and there are
many diverse examples even among amplification origins. The chorion amplicon DAFC-66D is
regulated by the E2F, MYB, and RB transcription factor complexes (BEALL et al. 2002; BOSCO
et al. 2001). However, as ACE3 and Ori-β are sufficient for amplification in the absence of the
cp18 transcription unit normally located between the two elements, active transcription is not necessary
for amplification (LU et al. 2001). In contrast, at DAFC-62D, transcription is required for MCM
complexes to be loaded at the second stage of initiation (XIE and ORR-WEAVER 2008). Gene
amplification in Sciara coprophila offers an example of the inhibitory effect of transcription on
replication initiation. In the salivary gland, puff II/9A is amplified to promote abundant
expression of genes encoding cocoon proteins. When the locus becomes transcriptionally active,
the replication initiation zone becomes constricted from a 7-8 kb region encompassing two
transcription units to a 2 kb region that does not coincide with any genes (LUNYAK et al. 2002).
The low expression levels of CG7337 and unique timing of Vm34Ca raise the possibility
that amplification at these regions is a consequence of gene expression and may not have a direct
role in regulating gene expression. One of the most reproducible findings among recent genome-wide origin mapping studies was the significant number of origins that corresponded to gene
regions and in particular, the transcription start sites of active genes. As both transcription and
replication initiation require open chromatin, these two processes may be functionally linked
with respect to the regulatory mechanisms controlling initiation; studies in several metazoan
systems show that early replicating regions correspond to actively transcribed zones
(MACALPINE et al. 2004; WHITE et al. 2004). A high resolution study of replication timing in
multiple cell lines using next generation sequencing methods reported that genes expressed
solely in one cell type were early-replicating exclusively in that cell type, suggesting a causal
effect of transcription on early replication (HANSEN et al. 2010). Although there are just six
follicle cell amplicons to examine, compared to thousands of origins activated in the canonical S
phase, all six ORC binding amplification origins correspond to gene regions. This result is not
unexpected since amplification can be a strategy to augment gene expression. However, given
the relationship between replication origins corresponding to active promoters, the hundreds of
genes that are highly expressed in follicle cells but not amplified also provide powerful models
to investigate what properties, in addition to transcription, determine ORC binding and origin
activation. As 16C follicle cells encompass stage 9 through 14 egg chambers and span over 20
hours, we believe many of these genes are actively expressed during amplification stages, though
RNA polymerase II localization on stage sorted follicle cell DNA would be necessary to assess
this directly.
We investigated the relationship between histone acetylation and gene amplification in
Drosophila follicle cells and found that acetylation of H4, and specifically H4K8, quantitatively
correlated with levels of gene amplification. The amplicons undergoing the most replication
initiation events displayed the highest levels of acetylation. Recent work in budding yeast, in which
histones around a single origin were purified and analyzed by high-resolution mass
spectrometry to identify all histone modifications throughout the cell cycle, revealed dynamic
acetylation patterns of histones H3 and H4 (UNNIKRISHNAN et al. 2010). Multiply acetylated H3
and H4 were shown to be required for efficient origin activation during S phase, suggesting that
H4 hyperacetylation is an evolutionarily conserved mark of replication initiation. Using an
amplification reporter, we found that H4K8 hyperacetylation is necessary for origin activation in
a transgene, as tethering a histone deacetylase to DAFC-66D and eliminating enrichment of
acetylated H4K8 resulted in complete repression of amplification at stage 10. However,
acetylated H4 is not sufficient for amplification, as there are regions in the genome that show
enrichment by ChIP but are not amplified. For example, DAFC-22B displays similar enrichment
of tetra-acetylated H4 and acetylated H4K8 in both strains, although the region is amplified in OrRTOW
but not in OrRMOD. One explanation is that acetylated H4 creates a chromatin
environment to which ORC can bind, but other conditions are necessary for ORC to localize to a
region with acetylated H4. In the context of gene amplification, levels of acetylated H4 may
influence the number of initiation events that can occur from an ORC bound amplification
origin. During chromosomal replication, acetylated H4 and additional chromatin marks may
specify the efficiency of origin activation.
DAFC-22B provides a powerful opportunity to study the determinants of ORC binding in
metazoans since the region displays strain-specific amplification that is correlated with ORC
localization. One model is that DAFC-22B is amplified in some strains because an element that
makes the region permissive to ORC binding has been gained. Conversely, the locus may not amplify
in other strains because some element that once made it permissive to ORC binding has been
lost. Nevertheless, amplification appears to have no effect on CG7337 expression in terms of
overall levels between the amplifying and non-amplifying strain, the reason for which remains an
enigma. DAFC-22B provides a unique model to investigate the determinants of ORC binding and
the relationship of gene amplification and transcription, highlighting the utility of genomic
approaches to uncover new model origins for molecular analysis.
MATERIALS AND METHODS
Comparative genomic hybridization
16C nuclei were isolated by FACS from OrRTOW and OrRMOD fattened females as
previously described (LILLY and SPRADLING 1996). Genomic DNA was prepared using the
DNeasy Blood and Tissue Kit (Qiagen), digested with AluI and RsaI, and labeled using
Invitrogen's BioPrime Total for Agilent aCGH labeling kit. Slides were hybridized to custom
Agilent tiling arrays with probes every 600 or 400 bp and washed as per Agilent
recommendations. Array intensities were median normalized across channels and smoothed by
genomic windows of 10 kb using the Ringo package in R (TOEDLING et al. 2007).
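As an illustration of the normalization and smoothing steps, the following Python sketch median-normalizes two array channels and smooths the resulting log2 ratios over a 10 kb genomic window. It is not the Ringo implementation used in this work: the function names, the choice of a running median as the smoother, and the example intensities are assumptions made for illustration only.

```python
import numpy as np

def median_normalize(sample, reference):
    """Return log2(sample/reference) after scaling each channel to a median of 1."""
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Dividing by the channel median removes global intensity differences
    # between the two dyes before the ratio is taken.
    sample_scaled = sample / np.median(sample)
    reference_scaled = reference / np.median(reference)
    return np.log2(sample_scaled / reference_scaled)

def smooth_by_window(positions, values, window_bp=10_000):
    """Running median of values over a window centered on each probe position."""
    positions = np.asarray(positions)
    values = np.asarray(values, dtype=float)
    half = window_bp / 2
    smoothed = np.empty_like(values)
    for i, pos in enumerate(positions):
        # Select every probe whose coordinate lies within half a window of this probe.
        in_window = np.abs(positions - pos) <= half
        smoothed[i] = np.median(values[in_window])
    return smoothed

# Hypothetical probe coordinates (bp) and raw intensities for the two channels.
probe_positions = [0, 500, 1000, 1500, 2000]
follicle_cell_channel = [100, 100, 400, 100, 100]
embryo_channel = [100, 100, 100, 100, 100]

log2_ratios = median_normalize(follicle_cell_channel, embryo_channel)
smoothed_ratios = smooth_by_window(probe_positions, log2_ratios, window_bp=10_000)
```

A running median is used here because it suppresses single-probe outliers while preserving broad domains of elevated copy number such as amplicons; other smoothers could be substituted.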
RNA-sequencing
Ovaries from two-day fattened OrRTOW females were dissected in Grace's medium
containing Hoechst, and follicle cells were isolated as described previously (BRYANT et al.
1999). Follicle cells were sorted by FACS, and RNA from 16C follicle cells was extracted using
Trizol according to the manufacturer's protocol. 100 ng of RNA was processed with the Illumina
mRNA Sample Preparation kit and subjected to DSN normalization according to the
manufacturer's protocol.
Chromatin immunoprecipitation
ChIP-qPCR was performed on 300 staged egg chambers per experiment as described
(XIE and ORR-WEAVER 2008). ChIP-chip was performed using four times the starting material.
All ChIP experiments were compared to input DNA. For hybridization to arrays, DNA was
labeled using Invitrogen's BioPrime Total for Agilent aCGH labeling kit. ChIP was performed
with the following antibodies: ORC2 (Steve Bell), tetra-acetylated H4 (Active Motif 39179),
acetylated H4K5 (Upstate 07-327), acetylated H4K8 (Upstate 07-328), and acetylated H4K12
(Upstate 07-595). The commercial antibodies used were validated for specificity by the
modENCODE consortium.
Drosophila strains and heat-shock overexpression
Transgenic lines carrying the TT1, hspGAL4DBD::Rpd3 or hspGAL4DBD::HAT1
transposons were a gift from Brian Calvi. Flies were crossed to introduce two transposons into
the same line: either TT1 and the Rpd3 fusion or TT1 and the HAT1 fusion. Siblings that
contained only TT1 (TT1/+) were kept as controls. A one-hour heat shock at 37°C was used to
overexpress the GAL4 fusion protein.
Quantitative PCR
Genomic DNA was isolated from staged egg chambers and quantified using absolute
quantitative PCR as described (CLAYCOMB et al. 2004) or relative quantification as described
(XIE and ORR-WEAVER 2008). Absolute quantification was used for the DAFC-22B replication
profile whereas relative quantification was used for all other experiments. CG7337 expression
analysis was performed on RNA samples prepared using Trizol and reverse transcribed with
AMV reverse transcriptase (Promega). Purified follicle cells were isolated using a protocol
modified from BRYANT et al. (1999). 200 stage 13 egg chambers were dissected in ice-cold Schneider's
medium supplemented with 10% FBS. Tissue was digested with 0.9 mL of 0.25% Trypsin/EDTA
and 0.1 mL of 50 mg/mL collagenase for 12 minutes at room temperature. The supernatant was
strained through a 40 µm mesh and spun at 100g for 7 minutes in the cold. Trizol was added to
the pellet for RNA isolation. For cDNA analysis, Rps17 was used as the endogenous control.
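For readers unfamiliar with relative quantification against an endogenous control, the sketch below shows the widely used comparative Ct calculation. The exact calculation used here follows the cited protocol and is not restated in this section, so this should be read as a generic illustration: the function name, the assumption of perfect doubling per cycle, and all Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_control, ct_target_ref, ct_control_ref):
    """Fold change of a target transcript relative to a reference sample (2^-ddCt)."""
    # Normalize the target to the endogenous control within each sample.
    delta_ct_sample = ct_target - ct_control
    delta_ct_reference = ct_target_ref - ct_control_ref
    # Compare the sample of interest with the reference sample.
    delta_delta_ct = delta_ct_sample - delta_ct_reference
    # Assumes both amplicons double every cycle (100% primer efficiency).
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: a target gene and the endogenous control measured
# in a sample of interest and in a reference sample.
fold_change = relative_expression(
    ct_target=22.0, ct_control=18.0,
    ct_target_ref=22.0, ct_control_ref=18.0,
)
```

With identical normalized Ct values in the two samples, as in the hypothetical numbers above, the function returns a fold change of 1, that is, no difference in expression.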
Transgenic fly construction
To test the cis requirements for amplification at DAFC-22B, we constructed transposons
with the 12 kb ORC binding region (stage 10) from OrRMOD (MOD4-9), the 12 kb ORC binding
region (stage 10) from OrRTOW (TOW4-9), and the 10 kb ORC binding region (cell culture) from
OrRTOW (TOW8-10). The gap in probes in the DAFC-22B aCGH amplification profile represents
repeated DNA sequence not present in either OrR strain. These sequences were flanked by
suppressor of Hairy wing binding sites (SHWBS) to control for genomic position-specific
integration effects. The sequences were PCR amplified using exTaq DNA polymerase (Takara)
and primers with AscI and AvrII sites on the forward and reverse sequences, respectively. These
products were cloned into a modified PCRA vector with AscI and AvrII sequences engineered
into the multiple cloning site (LU et al. 2001). These plasmids are called PCRA_22BMOD4-9,
PCRA_22BTOW4-9, and PCRA_22BTOW8-10. These plasmids were digested with NotI and
subjected to a partial XhoI digest to transfer the 12 kb or 10 kb inserts to the NotI and XhoI sites
of Big Parent to generate BP_22BMOD4-9, BP_22BTOW4-9, and BP_22B_TOW8-10. These
constructs were sent to BestGene Inc (Chino Hills, CA) for injection.
ACKNOWLEDGEMENTS
We thank David MacAlpine for design of the microarray, Steve Bell for the ORC2
antibody, and Brian Calvi for transgenic flies. George Bell provided helpful bioinformatics advice.
REFERENCES
AGGARWAL, B. D., and B. R. CALVI, 2004 Chromatin regulates origin activity in Drosophila
follicle cells. Nature 430: 372-376.
ALADJEM, M. I., 2007 Replication in context: dynamic regulation of DNA replication patterns in
metazoans. Nat Rev Genet 8: 588-600.
AUSTIN, R. J., T. L. ORR-WEAVER and S. P. BELL, 1999 Drosophila ORC specifically binds to
ACE3, an origin of DNA replication control element. Genes Dev 13: 2639-2649.
BEALL, E. L., J. R. MANAK, S. ZHOU, M. BELL, J. S. LIPSICK et al., 2002 Role for a Drosophila
Myb-containing protein complex in site-specific DNA replication. Nature 420: 833-837.
BELL, S. P., R. KOBAYASHI and B. STILLMAN, 1993 Yeast origin recognition complex functions
in transcription silencing and DNA replication. Science 262: 1844-1849.
BOSCO, G., W. DU and T. L. ORR-WEAVER, 2001 DNA replication control through interaction of
E2F-RB and the origin recognition complex. Nat Cell Biol 3: 289-295.
BRYANT, Z., L. SUBRAHMANYAN, M. TWOROGER, L. LATRAY, C. R. LIU et al., 1999
Characterization of differentially expressed genes in purified Drosophila follicle cells:
toward a general strategy for cell type-specific developmental analysis. Proc Natl Acad
Sci U S A 96: 5559-5564.
CADORET, J. C., F. MEISCH, V. HASSAN-ZADEH, I. LUYTEN, C. GUILLET et al., 2008 Genome-wide studies highlight indirect links between human replication origins and gene
regulation. Proc Natl Acad Sci U S A 105: 15837-15842.
CLAYCOMB, J. M., M. BENASUTTI, G. BOSCO, D. D. FENGER and T. L. ORR-WEAVER, 2004 Gene
amplification as a developmental strategy: isolation of two developmental amplicons in
Drosophila. Dev Cell 6: 145-155.
CLAYCOMB, J. M., D. M. MACALPINE, J. G. EVANS, S. P. BELL and T. L. ORR-WEAVER, 2002
Visualization of replication initiation and elongation in Drosophila. J Cell Biol 159: 225-236.
CLAYCOMB, J. M., and T. L. ORR-WEAVER, 2005 Developmental gene amplification: insights
into DNA replication and gene expression. Trends Genet 21: 149-162.
CVETIC, C., and J. C. WALTER, 2005 Eukaryotic origins of DNA replication: could you please be
more specific? Semin Cell Dev Biol 16: 343-353.
GILBERT, D. M., 2004 In search of the holy replicator. Nat Rev Mol Cell Biol 5: 848-855.
GRIFFIN-SHEA, R., G. THIREOS and F. C. KAFATOS, 1982 Organization of a cluster of four
chorion genes in Drosophila and its relationship to developmental expression and
amplification. Dev Biol 91: 325-336.
HAMLIN, J. L., L. D. MESNER and P. A. DIJKWEL, 2010 A winding road to origin discovery.
Chromosome Res 18: 45-61.
HANSEN, R. S., S. THOMAS, R. SANDSTROM, T. K. CANFIELD, R. E. THURMAN et al., 2010
Sequencing newly replicated DNA reveals widespread plasticity in human replication
timing. Proc Natl Acad Sci U S A 107: 139-144.
HARTL, T., C. BOSWELL, T. L. ORR-WEAVER and G. BOSCO, 2007 Developmentally regulated
histone modifications in Drosophila follicle cells: initiation of gene amplification is
associated with histone H3 and H4 hyperacetylation and H1 phosphorylation.
Chromosoma 116: 197-214.
HUANG, D. W., L. FANTI, D. T. PAK, M. R. BOTCHAN, S. PIMPINELLI et al., 1998 Distinct
cytoplasmic and nuclear fractions of Drosophila heterochromatin protein 1: their
phosphorylation levels and associations with origin recognition complex proteins. J Cell
Biol 142: 307-318.
KARNANI, N., C. M. TAYLOR, A. MALHOTRA and A. DUTTA, 2010 Genomic study of replication
initiation in human chromosomes reveals the influence of transcription regulation and
chromatin structure on origin selection. Mol Biol Cell 21: 393-404.
KOUZARIDES, T., 2007 Chromatin modifications and their function. Cell 128: 693-705.
LANDIS, G., R. KELLEY, A. C. SPRADLING and J. TOWER, 1997 The k43 gene, required for
chorion gene amplification and diploid cell chromosome replication, encodes the
Drosophila homolog of yeast origin recognition complex subunit 2. Proc Natl Acad Sci U
S A 94: 3888-3892.
LANDIS, G., and J. TOWER, 1999 The Drosophila chiffon gene is required for chorion gene
amplification, and is related to the yeast Dbf4 regulator of DNA replication and cell
cycle. Development 126: 4281-4293.
LILLY, M. A., and A. C. SPRADLING, 1996 The Drosophila endocycle is controlled by Cyclin E
and lacks a checkpoint ensuring S-phase completion. Genes Dev 10: 2514-2526.
LU, L., and J. TOWER, 1997 A transcriptional insulator element, the su(Hw) binding site, protects
a chromosomal DNA replication origin from position effects. Mol Cell Biol 17: 2202-2206.
LU, L., H. ZHANG and J. TOWER, 2001 Functionally distinct, sequence-specific replicator and
origin elements are required for Drosophila chorion gene amplification. Genes Dev 15:
134-146.
LUNYAK, V. V., M. EZROKHI, H. S. SMITH and S. A. GERBI, 2002 Developmental changes in the
Sciara II/9A initiation zone for DNA replication. Mol Cell Biol 22: 8426-8437.
MACALPINE, D. M., H. K. RODRIGUEZ and S. P. BELL, 2004 Coordination of replication and
transcription along a Drosophila chromosome. Genes Dev 18: 3094-3105.
MACALPINE, H. K., R. GORDAN, S. K. POWELL, A. J. HARTEMINK and D. M. MACALPINE, 2010
Drosophila ORC localizes to open chromatin and marks sites of cohesin complex
loading. Genome Res 20: 201-211.
MINDRINOS, M. N., L. J. SCHERER, F. J. GARCINI, H. KWAN, K. A. JACOBS et al., 1985 Isolation
and chromosomal location of putative vitelline membrane genes in Drosophila
melanogaster. Embo J 4: 147-153.
OSHEIM, Y. N., O. L. MILLER, JR. and A. L. BEYER, 1988 Visualization of Drosophila
melanogaster chorion genes undergoing amplification. Mol Cell Biol 8: 2811-2821.
PAK, D. T., M. PFLUMM, I. CHESNOKOV, D. W. HUANG, R. KELLUM et al., 1997 Association of
the origin recognition complex with heterochromatin and HP1 in higher eukaryotes. Cell
91: 311-323.
PARKS, S., and A. SPRADLING, 1987 Spatially regulated expression of chorion genes during
Drosophila oogenesis. Genes Dev 1: 497-509.
PARKS, S., B. WAKIMOTO and A. SPRADLING, 1986 Replication and expression of an X-linked
cluster of Drosophila chorion genes. Dev Biol 117: 294-305.
SCHWED, G., N. MAY, Y. PECHERSKY and B. R. CALVI, 2002 Drosophila minichromosome
maintenance 6 is required for chorion gene amplification and genomic replication. Mol
Biol Cell 13: 607-620.
SEQUEIRA-MENDES, J., R. DIAZ-URIARTE, A. APEDAILE, D. HUNTLEY, N. BROCKDORFF et al.,
2009 Transcription initiation activity sets replication origin efficiency in mammalian
cells. PLoS Genet 5: e1000446.
SPRADLING, A. C., 1981 The organization and amplification of two chromosomal domains
containing Drosophila chorion genes. Cell 27: 193-201.
TAKAHASHI, T. S., P. YIU, M. F. CHOU, S. GYGI and J. C. WALTER, 2004 Recruitment of Xenopus
Scc2 and cohesin to chromatin requires the pre-replication complex. Nat Cell Biol 6:
991-996.
TOEDLING, J., O. SKYLAR, T. KRUEGER, J. J. FISCHER, S. SPERLING et al., 2007 Ringo--an
R/Bioconductor package for analyzing ChIP-chip readouts. BMC Bioinformatics 8: 221.
UNNIKRISHNAN, A., P. R. GAFKEN and T. TSUKIYAMA, 2010 Dynamic changes in histone
acetylation regulate origins of DNA replication. Nat Struct Mol Biol 17: 430-437.
WHITE, E. J., O. EMANUELSSON, D. SCALZO, T. ROYCE, S. KOSAK et al., 2004 DNA replication-timing analysis of human chromosome 22 at high resolution and different developmental
states. Proc Natl Acad Sci U S A 101: 17771-17776.
XIE, F., and T. L. ORR-WEAVER, 2008 Isolation of a Drosophila amplification origin
developmentally activated by transcription. Proc Natl Acad Sci U S A 105: 9651-9656.
Chapter Three:
Differential ORC localization during two rounds of replication
initiation at a Drosophila follicle cell amplicon
Jane C. Kim and Terry L. Orr-Weaver
Whitehead Institute and Dept. of Biology, Massachusetts Institute of Technology
Cambridge, MA 02142
ABSTRACT
We investigated the developmental and replication properties of a newly identified follicle cell
amplicon, DAFC-34B. DAFC-34B contains two genes that are expressed in follicle cells, though
their timing and spatial patterns of expression suggest that amplification is not a strategy to
promote high levels of expression at this locus. Vm34Ca is a structural component of the vitelline
membrane but is expressed prior to the onset of gene amplification. CG16956 is expressed in
amplification stages but only in a small subset of follicle cells. Like the previously characterized
DAFC-62D, DAFC-34B displays origin firing at two separate stages of development. However,
unlike DAFC-62D, amplification at the later stage is not transcription dependent. We mapped the
DAFC-34B amplification origin to 1 kb by nascent strand analysis and delineated the cis
requirements for origin activity, finding that a 6 kb region, but not the 1 kb origin alone, is
sufficient for amplification. We analyzed the developmental localization of ORC, the origin
recognition complex, and the MCM complex, the replicative helicase. Intriguingly, the final
round of origin activation at DAFC-34B occurs in the absence of detectable ORC, though MCMs
are present, suggesting a new amplification initiation mechanism.
INTRODUCTION
The initiation of DNA replication is a critical regulatory step for complete duplication of
the genome during S phase. In eukaryotes, DNA replication initiates from hundreds to thousands
of sites, called replication origins, across the genome. Bidirectional replication proceeds from
these initiation sites until the replication forks from adjacent origins converge and the genome is
fully replicated. In metazoans, there is no sequence-specific motif for specification of origin
activity or localization of the origin recognition complex (ORC), the essential replication
initiation factor that is conserved in all eukaryotes (CVETIC and WALTER 2005; GILBERT 2004).
Origins individually identified in cell culture such as the dihydrofolate reductase (DHFR) locus
in Chinese hamster ovary (CHO) cells and the human β-globin locus, among several others, have
been the molecular workhorses for replication origin studies (ALADJEM 2007). The DHFR locus
is an example of an extended replication zone where origin activation can occur from one of
multiple inefficient origins, whereas the β-globin locus displays a high frequency of initiation
from a confined site (HAMLIN et al. 2010; KITSBERG et al. 1993).
Recently, there has been a tremendous increase in the number of metazoan origins
identified, owing to the coupling of methods that select origin-centered DNA with microarray or high-throughput sequencing technologies (CADORET et al. 2008; HANSEN et al. 2010; KARNANI et al.
2010; LUCAS et al. 2007; MACALPINE et al. 2010; SEQUEIRA-MENDES et al. 2009). These
methods include isolation of small DNA fragments (0.5 kb to 2 kb) combined with λ-exonuclease treatment to purify short nascent strands, which are protected from digestion by
their 5' RNA primers, as well as pulsing cells with the nucleotide analog bromodeoxyuridine
(BrdU) and immunoprecipitating newly synthesized DNA with an anti-BrdU antibody. In
addition, origin activation produces a replication bubble, and these circular DNA structures can
be selectively trapped in gelling agarose, cloned, and identified (MESNER et al. 2006).
Replication origins can also be identified using chromatin immunoprecipitation (ChIP) of ORC
because this complex marks all potential sites of origin activation. The strength of these
approaches lies in the ability to examine hundreds to thousands of mapped origins and ORC
binding sites to look for relationships to genome-wide properties including transcription,
epigenetic modification, and the coordination of replication timing. The picture emerging from
these studies is that origin activation and ORC binding are significantly influenced by an open
chromatin structure, showing a strong correlation to transcription start sites, particularly active
promoters, as well as DNaseI hypersensitive sites and histone modifications associated with
active transcription. Genome regions with a high density of these features tend to be early
replicating in S phase. However, a common limitation of these studies is that they rely on cell
culture, and it is not possible to study events directly at the time of origin activation.
Gene amplification in Drosophila follicle cells is an excellent in vivo model for
investigating the regulation of DNA replication initiation (CLAYCOMB and ORR-WEAVER 2005).
Gene amplification occurs by a process of repeated origin activation and bidirectional fork
progression resulting in a gradient of replicated DNA that spans approximately 100 kb. Genes
that encode structural components of the eggshell and eggshell cross-linking enzymes are located
in the peak of amplification at several amplicons (CLAYCOMB et al. 2004; SPRADLING 1981).
This increase in genomic template enables the eggshell to be constructed in a short
developmental period. Importantly for its use as a replication model, gene amplification relies on
the same replication machinery and cell cycle kinase regulation that is used in the typical
eukaryotic S phase.
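The way repeated origin firing and bidirectional fork movement produce a copy number gradient can be illustrated with a toy calculation. The Python sketch below assumes an idealized model in which every existing copy of the origin fires in each round and all forks travel outward at one constant speed; the function name, firing times, and fork speed are hypothetical and are not measurements from this work.

```python
def copy_number(distance_kb, firing_times_h, now_h, fork_speed_kb_per_h):
    """Fold copy number at a given distance from the origin in an idealized model.

    Each round of firing doubles the DNA behind its forks, so the copy number at
    a position is 2 raised to the number of rounds whose forks have passed it.
    """
    rounds_passed = sum(
        1
        for fired_at in firing_times_h
        # A round counts once it has fired and its forks have reached this position.
        if now_h >= fired_at
        and fork_speed_kb_per_h * (now_h - fired_at) >= abs(distance_kb)
    )
    return 2 ** rounds_passed

# Hypothetical parameters: three rounds of firing, 2 hours apart, forks moving
# at 5 kb per hour, profile sampled 6 hours after the first round.
firing_times = [0.0, 2.0, 4.0]
profile = {
    distance: copy_number(distance, firing_times, now_h=6.0, fork_speed_kb_per_h=5.0)
    for distance in range(-40, 41, 5)
}
```

Because forks from later rounds start later and never overtake earlier ones in this model, copy number falls off stepwise and symmetrically with distance from the origin, which is the gradient shape described above.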
Analyses of two Drosophila Amplicons in Follicle Cells, DAFC-66D and DAFC-62D,
reveal distinct developmental and regulatory strategies. At the major chorion amplicon DAFC-66D, replication initiation and exclusive elongation occur in distinct phases with origin
activation events confined to stages 10B and 11, followed by elongation of existing replication
forks in stages 12 and 13 (CLAYCOMB et al. 2002). In contrast, DAFC-62D displays two stages of
replication initiation, one at stage 10B and another at stage 13, with elongation occurring in the
intervening stages (CLAYCOMB et al. 2004). Furthermore, whereas DAFC-66D amplification can
be delineated to the cis interaction of a 320 base pair amplification enhancer, ACE3, with a major
884 base pair replication origin, Ori-β (LU et al. 2001), ectopic amplification of DAFC-62D
cannot be narrowed down to a region smaller than 10 kb (XIE and ORR-WEAVER 2008). In
addition, the second phase of replication initiation at DAFC-62D is surprisingly transcription
dependent, requiring active transcription for loading of the MCM helicase, though not ORC
localization (XIE and ORR-WEAVER 2008). These studies suggest that elucidating the
developmental and regulatory strategies of additional follicle cell amplicons will uncover new
mechanisms regulating DNA replication initiation.
Recently, all follicle cell amplicons were identified in Drosophila using an array based
comparative genomic hybridization (aCGH) approach, which uncovered two new amplicons
(Chapter 2). This study examined the relationship of gene amplification to transcription, ORC
localization, and histone H4 acetylation on a genome-wide scale revealing that amplified regions
often, though not always, contain highly expressed genes. ORC localizes to the most amplified
region, making the amplicons useful replication models, and amplification levels correlate with
acetylated H4 levels. Here we analyze DAFC-34B in close detail and find yet another set of
developmental and regulatory control strategies, distinct from those of DAFC-66D and DAFC-62D.
RESULTS
Two genes in DAFC-34B are expressed in follicle cells
DAFC-34B was recently identified as a new amplicon using an aCGH strategy to uncover
all follicle cell amplicons (Chapter 2). After undergoing three rounds of endoreduplication,
chromosomal replication without intervening mitoses, to reach 16C copy levels, follicle cells
initiate synchronous gene amplification at stage 10B. The 16C follicle cell population derives from egg chambers
of stages 9 through 14 and is enriched for amplification stages. Whole genome tiling arrays
were competitively hybridized with DNA from 16C nuclei and diploid early embryos to uncover
regions of follicle cell amplification. Additionally, Illumina RNA-sequencing of 16C follicle
cells showed that several genes in the amplified region are expressed and that at least one, Vm34Ca, a
vitelline membrane gene located in the peak of amplification, is expressed at very high levels (Figure 3-1A).
To assess the timing and spatial expression pattern of genes in the amplified region more
precisely, we performed in situ hybridization of nine genes in the central amplified region.
Consistent with the RNA-seq data, Vm34Ca was highly expressed in follicle cells. However, this
gene was expressed beginning in stage 8, prior to the onset of synchronous gene amplification,
and continued until stage 10B (Figure 3-1B), consistent with previous expression analysis by
Northern blot (MINDRINOS et al. 1985).
Of the remaining genes we examined, only one was expressed in follicle cells by in situ
hybridization. CG16956 is expressed in stages 12 and 13, but only in a small subset of follicle
cells at the anterior region (Figure 3-1C), which was similar to the late stage expression patterns
of yellow-g and yellow-g2 in DAFC-62D (CLAYCOMB et al. 2004). Gene amplification is
generally considered a strategy to promote high levels of gene expression. Because the
expression patterns of Vm34Ca and CG16956 did not conform to this simple model, showing
high expression levels prior to the start of synchronous amplification and being expressed in a
limited number of cells, we investigated the precise timing and possible cell population
specificity of replication events.
Figure 3-1. Two genes located in DAFC-34B are expressed in follicle cells.
100 kb amplified region of DAFC-34B with aCGH in log2 ratio and RNA-seq data in number of
mapped reads (A). Nine genes were tested by in situ hybridization for expression in follicle cells.
The genes in black were tested but did not show any signal in follicle cells. Vm34Ca is expressed
broadly in all follicle cells from stage 8, prior to the onset of synchronous gene amplification,
through stage 10B (B). CG16956 is expressed exclusively in a small population of cells at the
anterior region. Stage 12 is above, and stage 13 is below in (C).
[Figure 3-1. Panel A: gene map of the amplified region with aCGH and RNA-seq tracks; scale bar, 50 kb.]
DAFC-34B shows two distinct stages of replication initiation
Drosophila egg chambers are morphologically distinct and thus allow origin activation to
be analyzed in the context of developmental progression. To determine the replication profile at
DAFC-34B, we hand sorted individual egg chamber stages to isolate genomic DNA and assessed
DNA copy levels by quantitative real-time PCR (qPCR) compared to a non-amplified locus. This
approach allows initiation to be distinguished from elongation based on the shape of the
replication profile: specifically, whether there is an increase in DNA copy number at the most
amplified region or only in the flanking regions. Because Vm34Ca is expressed beginning in
stage 8, one possibility was that gene amplification begins in an earlier stage at DAFC-34B,
which had not been previously recognized in BrdU immunofluorescence experiments. However,
at stage 9, no amplification was detected at DAFC-34B (Figure 3-2A), confirming that Vm34Ca
expression occurs prior to amplification. At stage 10B, we observed an increase in copy number
to 4-fold, showing that replication initiation occurs at this stage (Figure 3-2C). At stage 11, there
was no further increase in copy number, but a widening of the replication profile, indicating the
process of replication elongation (Figure 3-2D). At stage 12, we saw a further doubling of DNA
copy number, revealing a second phase of DNA replication initiation, followed again by
elongation at stage 13 (Figure 3-2E and 3-2F). Thus, DAFC-34B shows two stages of replication
initiation in the same region with a period of only elongation in between, which is similar to the
previously characterized DAFC-62D.
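The logic used to distinguish initiation from elongation can be summarized in a short Python sketch. It compares copy number profiles measured at the same genomic positions in two consecutive stages: a rise at the peak indicates initiation, whereas an unchanged peak with a broader profile indicates elongation. The function name, the 1.5-fold peak threshold, the half-maximum width criterion, and the example profiles are illustrative assumptions, not the analysis or data reported here.

```python
def classify_transition(prev_copies, next_copies, peak_gain=1.5):
    """Classify the change between two stages as initiation, elongation, or no change.

    Both arguments are fold copy numbers (relative to a non-amplified locus)
    measured at the same ordered set of genomic positions.
    """
    prev_peak = max(prev_copies)
    next_peak = max(next_copies)
    # A clear rise at the most amplified position means new origin firing.
    if next_peak / prev_peak >= peak_gain:
        return "initiation"
    # Otherwise compare profile widths at half of the earlier peak height.
    half_max = prev_peak / 2
    prev_width = sum(copies >= half_max for copies in prev_copies)
    next_width = sum(copies >= half_max for copies in next_copies)
    if next_width > prev_width:
        return "elongation"
    return "no change"

# Hypothetical profiles at seven evenly spaced positions across an amplicon.
stage_10b = [1.0, 1.5, 3.0, 4.0, 3.0, 1.5, 1.0]
stage_11 = [1.5, 2.5, 3.5, 4.0, 3.5, 2.5, 1.5]
stage_12 = [2.0, 3.5, 6.0, 8.0, 6.0, 3.5, 2.0]

first_transition = classify_transition(stage_10b, stage_11)
second_transition = classify_transition(stage_11, stage_12)
```

With these hypothetical numbers the first transition is classified as elongation and the second as initiation, mirroring the alternating pattern described for this locus.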
Figure 3-2. DAFC-34B exhibits two stages of replication initiation.
Genomic DNA was isolated from hand-sorted staged egg chambers, and DNA copy levels were
quantified by qPCR compared to a non-amplified locus at 62C5 (A-G). Error bars show standard
deviation for triplicate reactions. Genomic position is shown on the X axis (13380 is
Chr2L:13,380,000). Although Vm34Ca begins to be expressed at stage 8, gene amplification is
first observed at stage 10A, when there is a two-fold increase in DNA copy number (B).
Replication initiation occurs in stage 10B to reach approximately four-fold amplification (C).
There is another period of replication initiation at late stage 12, resulting in a doubling in copy
number at DAFC-34B (E).
[Figure 3-2. Panels: DAFC-34B at stages 9, 10A, 10B, 11, 12, and 13, and an overlay of all six stages. X axis: Genomic Position, 13380 to 13440.]
The timing of CG16956 expression in stages 12 and 13 was consistent with the timing of
gene amplification, but the expression pattern in a small number of cells raised the possibility
that DAFC-34B might be differentially amplified among follicle cell populations. To address
whether CG16956-expressingcells amplify this genomic region to higher levels than other
follicle cell populations, we used two approaches. First, the expression pattern of CG16956 was
reminiscent of slow borders (slbo), a gene expressed in and required for the migration of border
cells (MONTELL et al. 1992). Border cells are a small group of cells necessary for proper
development of a functional micropyle, or sperm-entry structure in the anterior region, and
CG16956 expression co-localized with the slbo marker (Figure 3-3A). We isolated slboexpressing cells by FACS, using the GAL4-UAS system to drive GFP expression with the slbo
regulatory region (slbo-GAL4) (WANG et al. 2006). When compared to DNA recovered using a
ubiquitous follicle cell driver (c323a), slbo-positive cells did not display higher gene
amplification levels (Figure 3-3B). In fact, when we examined the top 20% of GFP expressing
cells, slbo-positive cells showed lower levels of gene amplification than the total cell population
driven by c323a. Second, we divided stage 13 egg chambers into anterior and posterior regions
by hand and performed qPCR on the genomic DNA samples. The anterior region did not display
more abundant amplification of DAFC-34B and reproducibly showed lower DNA copy number
than either the posterior region or whole egg chambers (Figure 3-3C). Thus, DAFC-34B is not
amplified to higher levels in CG16956-expressing cells or anterior follicle cells compared to the
rest of the follicle cell population.
We next addressed whether transcription influenced DAFC-34B amplification. At DAFC-62D, the final round of initiation is dependent on transcription, as culturing egg chambers in the
presence of the drug α-amanitin, which blocks RNAPII translocation, for 5 hours specifically
Figure 3-3. Follicle cells expressing CG16956 do not selectively or more greatly amplify DAFC-34B.
RNA fluorescent in situ hybridization was performed along with α-GFP immunofluorescence on
slbo-GAL4, UAS-GFP ovaries to determine co-localization of CG16956 and the border cell-specific
slbo marker (A). CG16956 is expressed in slbo-positive cells. GFP-positive, slbo-expressing cells were recovered by FACS, and genomic DNA from these samples was compared
to GFP-positive cells driven by the ubiquitous follicle cell driver, c323a. Results for total GFP-positive cells and the top 20% of GFP-positive cells are shown in (B). Stage 13 egg chambers
were hand-dissected into anterior and posterior regions, and genomic DNA from these samples
was examined by qPCR (C).
[Figure 3-3. Panels A to C; amplicons plotted: DAFC-66D and DAFC-34B.]
blocked stage 13 origin activation but had no effect on stage 10 amplification (XIE and ORR-WEAVER 2008). As both DAFC-34B and DAFC-62D exhibit two distinct stages of replication
initiation, we tested the effect of transcription inhibition on DAFC-34B amplification to see if
these late stage amplification events occur by the same mechanism. As at DAFC-62D, we found
that five-hour α-amanitin treatment had no effect on stage 10B origin activation at DAFC-34B
(Figure 3-4A). However, whereas α-amanitin specifically blocked stage 13 origin activation at
DAFC-62D, it had no effect on late stage origin activation at DAFC-34B (Figure 3-4B),
indicating a different mode of regulation.
DAFC-34B amplification origin corresponds to the Vm34Ca transcription unit
The amplification origins of DAFC-66D and DAFC-62D have been mapped using
various methods, and the results show distinct origin positions with respect to the surrounding
genes. The major origin at the chorion amplicon DAFC-66D is intergenic, though still residing in
a gene-rich region (HECK and SPRADLING 1990), whereas the origin at DAFC-62D coincides
with the yellow-g2 gene (XIE and ORR-WEAVER 2008). We used nascent strand analysis to map
the origin at DAFC-34B. We hand sorted stage 10 egg chambers, enriched for replication
intermediates, and size fractionated DNA fragments by gel electrophoresis. Short nascent strands
were further enriched using λ-exonuclease treatment to remove nicked DNA that lacks 5' RNA
primers. Using this method, we found that the DAFC-34B amplification origin coincides with the
transcription unit of Vm34Ca (Figure 3-5A). We found that this gene region was highly enriched
in the 0.7 to 1.5 kb fractions of DNA in three biological replicates. As a control for the efficiency
of nuclease digestion and the recovery of short nascent strands, we found that DNA greater than
5 kb in size showed uniform low enrichment levels across the 10 kb most amplified region of
Figure 3-4. Transcription inhibition with α-amanitin has no effect on DAFC-34B amplification.
Whole ovaries were cultured in vitro with α-amanitin for 5 hours, after which staged egg
chambers were hand sorted. qPCR quantification of stage 10 and stage 13 is shown for DAFC-34B and DAFC-62D. At stage 10, α-amanitin treatment has no effect on amplification at either
amplicon (A). At stage 13, α-amanitin treatment specifically inhibits late amplification at DAFC-62D but has no effect at DAFC-34B (B).
[Figure 3-4. Panels: (A) Stage 10 and (B) Stage 13; series: − α-amanitin and + α-amanitin.]
DAFC-34B (Figure 3-5B). Notably, we used probes to detect nascent strand abundance in the
other gene regions located in DAFC-34B but did not observe any enrichment (Figure 3-5C).
Developmental control of ORC and MCM localization at DAFC-34B
Because DAFC-34B exhibits two rounds of replication initiation, we examined the stage-specific
localization of ORC at this genomic locus to see how it corresponded to origin firing.
Genome-wide localization using ORC2 ChIP-chip in stage 10 egg chambers showed that ORC
localizes to all follicle cell amplicons and, at DAFC-34B, to an approximately 10 kb zone centered
at the peak of amplification (Chapter 2 and Figure 3-6A). We confirmed this result by ChIP-qPCR using independent biological samples and saw the same profile of ORC localization for
DAFC-34B as with ChIP-chip (Figure 3-6B). With both methods, we observed a sharp boundary
of ORC binding that coincided with Vm34Ca, which also corresponds to the origin we identified
by nascent strand analysis. Strikingly, when we examined subsequent stages for ORC
localization at DAFC-34B, we found that ORC enrichment was absent despite the second round
of replication initiation. We performed ORC2 ChIP-chip for pooled stage 11 and 12 egg
chambers and found that ORC was not detectable at DAFC-34B. In contrast, we observed that
ORC was present at DAFC-62D, as previously reported by ChIP-qPCR (XIE and
ORR-WEAVER 2008) (Figure 3-6A). We confirmed this result for pooled stages 11/12 by ChIP-qPCR and
also examined ORC localization at DAFC-34B and DAFC-62D at stage 13. In these later stages of egg
chamber development, ORC was enriched at DAFC-62D but not at DAFC-34B (Figure 3-6C).
We examined the localization of the MCM2-7 replicative helicase by ChIP and found
that the MCM complex localizes to DAFC-34B, coincident with both stages of replication
initiation. At stage 10, we observed that MCMs localize to DAFC-34B by ChIP-chip (Figure 3-
Figure 3-5. DAFC-34B amplification origin maps to Vm34Ca transcription unit by nascent
strand analysis.
Size fractionated nascent DNA was isolated from stage 10 egg chambers, and the abundance of
short nascent strands from the 1 to 1.5 kb fraction (plus λ-exonuclease treatment) was quantified
by qPCR compared to a non-amplified locus (A). Genomic position is shown on the X axis
(8000 is Chr2L: 13,408,000). There is a peak of nascent strand enrichment corresponding to the
Vm34Ca transcription unit. This enrichment is absent in the 5 kb and up fraction (minus λ-exonuclease treatment) (B). The region was assessed using additional probes in the 0.7 to 1 kb
(plus λ-exonuclease treatment) fraction, and the enrichment was specific to the Vm34Ca gene
and not adjacent gene regions (C).
[Figure 3-5. Panels A to C; series: Nascent Strand (1-1.5 kb) + lambda, Nascent Strand (5 kb and up) - lambda, Nascent Strand (0.7-1 kb) + lambda. X axis: Genomic Position. Genes labeled: CG7110, Vm34Ca, CG16848.]
Figure 3-6. ORC is not detectable at DAFC-34B after stage 10 despite a late stage of replication
initiation.
ChIP-chip experiments using antibodies specific for ORC2 and MCM2-7 were performed on
stage 10 as well as pooled stages 11 and 12 egg chambers. The amplified regions for DAFC-34B,
DAFC-62D, and DAFC-66D are shown. ORC localizes to DAFC-34B in a 10 kb zone with a
boundary of ORC binding corresponding to the Vm34Ca transcription unit. These results are
consistent with quantification by ChIP-qPCR (B). After stage 10, there is no localization of ORC
at DAFC-34B despite enrichment at DAFC-62D (A, C), which also exhibits late amplification.
DAFC-66D, which shows replication initiation at stages 10 and 11, is shown for comparison.
Although ORC is not detected at DAFC-34B after stage 10, MCM enrichment was observed at
stage 13 using ChIP-qPCR (D).
[Figure 3-6. (A) Tracks for DAFC-34B, DAFC-62D, and DAFC-66D: CGH, Stage 10 ORC ChIP, Stage 10 MCM ChIP, Stage 11/12 ORC ChIP, Stage 11/12 MCM ChIP. (B) DAFC-34B Stage 10 ORC ChIP; X axis: Genomic Position; genes labeled: CG7110, CG16848, Vm34Ca, CG16956. Remaining panels: Stage 13 ORC ChIP, Stage 11/12 MCM2-7 ChIP, and Stage 13 MCM2-7 ChIP at DAFC-62D and at two DAFC-34B positions, (1) and (2); X axis: genomic position.]
6A). Furthermore, we observed MCM enrichment at stage 13 by ChIP-qPCR (Figure 3-6D).
Thus, the late stage initiation at DAFC-34B does not occur in the complete absence of all pre-replicative complex (pre-RC) components. We did not observe MCM enrichment in pooled
stages 11 and 12 egg chambers at DAFC-34B, indicating that a pool of MCMs pre-loaded at
stage 10 is not responsible for later origin activation. In contrast, at DAFC-66D, which
undergoes replication initiation at stage 11, we observed significant MCM enrichment in pooled
stages 11 and 12. DAFC-34B thus provides an example of replication initiation that occurs in the
absence of detectable ORC despite MCM loading.
ORC mutant blocks both rounds of initiation at DAFC-34B
Given the finding of replication initiation in the absence of ORC, we tested genetically
whether each period of origin activation is dependent on functional ORC. orc2fs293 is a female-sterile
allele that specifically reduces follicle cell gene amplification, and orc2k43 is a null allele
(LANDIS et al. 1997). We reasoned that, as ORC is absent for the late stage amplification, it
might not be required for this round of replication initiation, and we would observe amplification
specifically in the late stage egg chambers. When we examined DNA copy levels for stages 10B
and 13 egg chambers by qPCR, however, the mutant effect on gene amplification at DAFC-34B
was comparable to the other two amplicons examined (Figure 3-7). All three displayed 1.5 to 2-fold enrichment at stage 13, likely due to low levels of ORC activity. These results indicate that
both rounds of replication initiation require ORC, though initiation in the later stages in its
absence could be explained by some earlier action of ORC to promote replication initiation. For
example, the first round of replication initiation at DAFC-34B may set up conditions that permit
the final origin firing in the absence of ORC function.
Figure 3-7. ORC function is required to allow both stages of replication initiation at DAFC-34B.
DNA copy number was examined in a female-sterile allele of ORC, orc2fs293, which reduces
gene amplification. orc2k43 is a null allele. Amplification in the sibling heterozygotes is shown
in (A) and (B), showing typical amplification levels. In orc2fs293/orc2k43, there was no specific
increase in gene copy number by stage 13 at DAFC-34B. There was 1.5 to 2-fold amplification
by stage 13 at DAFC-34B, DAFC-62D, and DAFC-66D, likely due to low levels of ORC
activity.
[Figure 3-7. Panels: Gene Amplification in Orc2/+ and Gene Amplification in Orc2 Mutants; series: DAFC-34B, DAFC-62D, DAFC-66D.]
Delineation of cis control elements for replication at DAFC-34B
To delineate the cis requirements for amplification at DAFC-34B and in particular, to
determine if distinct control elements necessary for early versus late stage initiation could be
defined, we used P element-mediated transformation to test sufficiency for amplification at an
ectopic site. We used Suppressor of Hairy-wing insulator binding sites (SHWBS) to flank the
test sequences and minimize inhibitory effects on gene amplification due to insertion position
(LU and TOWER 1997). We tested the 1 kb origin sequence alone and found that it was
insufficient for amplification, indicating that additional sequences, possibly conferring
replication enhancer activity, are necessary (Figure 3-8). When we examined a transformed 10
kb region spanning the stage 10 ORC binding zone by qPCR, we found that this transformant
line showed the same timing and magnitude of amplification as the endogenous DAFC-34B
locus, indicating that this region contains all of the information necessary for amplification
(Figure 3-8).
Previous analyses of DAFC-66D and DAFC-62D demonstrated that not all follicle cell
amplicons could be reduced to equivalent replication control elements. At DAFC-66D the
interaction of ACE3 and Oriβ is sufficient for amplification, whereas at DAFC-62D a region
sufficient for amplification cannot be limited to smaller than 10 kb. DAFC-34B displays an
intermediate result. We found that a 6 kb region that spans the peak of ORC binding is sufficient
for amplification. However, neither a 2.6 kb fragment containing the DAFC-34B origin nor a 1.8
kb fragment spanning the peak of ORC binding was sufficient for amplification. With the
sequences we tested, we did not separate replication control elements that were responsible for
the two developmental stages of gene amplification.
Figure 3-8. A 6 kb ORC binding region is sufficient for amplification of DAFC-34B.
DNA sequences tested by P element-mediated transformation are shown in (A) with the results
of ectopic amplification in (B). Neither the 1 kb origin alone nor a 2.1 kb region containing the
origin is sufficient for amplification. The 6 kb region containing the 1 kb origin as well as the
peak of ORC binding is the minimal sequence that conferred amplification activity.
[Figure 3-8. (A) Map of the tested constructs: DAFC-34B 1kb, 10kb, 6kb, 2.1kb, and 1.8kb. (B) Table with columns: Construct, No. of lines tested, Replication Initiation (Stage 10, Stage 13).]
DISCUSSION
We investigated the developmental and regulatory strategies of DAFC-34B and find that
it displays unique characteristics, making it a valuable new replication model to delineate the
relationship of transcription and gene amplification as well as the requirements of ORC function
for gene amplification. Though the amplicon contains genes expressed in follicle cells, their
expression patterns, with regard to timing and spatial restriction, have not been observed among
other amplicons. Although DAFC-34B has two rounds of replication initiation, we find that the second
round of origin firing is not dependent on transcription, as it is in DAFC-62D. Strikingly, we find
that the second round of replication initiation occurs in the absence of detectable ORC, though
MCMs are loaded. Finally, we delineate the cis requirements for replication and find that a 6 kb
minimal region is sufficient for origin function.
DAFC-34B contains two genes that are expressed specifically in follicle cells, but why
these genes are amplified remains an enigma. Vm34Ca is expressed beginning in stage 8 prior to
the onset of gene amplification, and CG16956 is expressed only in a small subset of follicle cells
at the anterior region during amplification stages. We performed in situ hybridization in
replication factor mutants but did not observe any detectable changes in expression level, spatial
distribution, or temporal profile of either Vm34Ca or CG16956 (data not shown), suggesting that
amplification is not necessary for adequate expression. In addition, we used RNAi lines to test
whether Vm34Ca or CG16956 are required for eggshell formation or fertility (DIETZL et al.
2007). The line targeting Vm34Ca has two off-targets that are homologous vitelline membrane
genes, and expression of this RNAi construct with a ubiquitous follicle cell driver resulted in thin
eggshells and failure of embryos to hatch. In contrast, expression of the RNAi construct targeting
CG16956, with no reported off-targets, had no detectable phenotype despite a 60% reduction in
expression levels (data not shown).
Recent studies in cell culture have begun to identify replication origins on a genome-wide
scale, finding correlations to other genomic processes and features such as transcription and
histone modifications (KARNANI et al. 2010; MACALPINE et al. 2010; SEQUEIRA-MENDES et al.
2009). Despite their diverse approaches, one of the common findings among these studies is that
replication origins and ORC binding sites significantly correspond to gene regions, specifically
the transcription start site of actively transcribed genes. It has been proposed that the lack of
sequence specificity for ORC binding in metazoans, and rather the reliance on open and active
chromatin to specify origins, may serve to ensure that origin selection can change according to
developmental stage and cell type (MACALPINE et al. 2010). However, the mechanism of how
active transcription, in many but not all cases, enables a DNA sequence to function as a
replication origin requires analysis of individual replication origins.
Consistent with the majority of origins identified in cell culture, we find that the DAFC-34B amplification origin corresponds to the transcription unit of Vm34Ca. We used nascent
strand analysis to map the DAFC-34B amplification origin, though due to the small size of the
Vm34Ca transcription unit (455 bp) and the size fraction of the nascent DNA we used (0.7 to 1.5
kb), we could not precisely distinguish whether the origin mapped to the gene region or TSS.
The coincidence of the amplification origin with the Vm34Ca transcription unit makes DAFC-34B an excellent model to test directly the role of active transcription on gene amplification. One
model for amplification at DAFC-34B is that abundant transcription, creating a chromatin
environment permissive for ORC binding, promotes amplification of this region. For example,
metazoan ORC has been shown to preferentially bind negatively supercoiled DNA (REMUS et al.
2004), which can be found directly upstream of active promoters. Although we showed that
transcription inhibition by five-hour α-amanitin treatment had no effect on stage 10
amplification, we cannot conclude whether transcription is absolutely necessary for stage 10
amplification since Vm34Ca is expressed for over 12 hours, and inhibiting expression for the
duration of stages 8 to 10 is not possible in vitro. One certainty is that active transcription, by
itself, is not sufficient for amplification because homologous vitelline membrane genes (for
example, Vm26Aa, Vm26Ab, and Vm32E) are expressed at similarly high levels, yet none of
these regions are amplified (Chapter 2). Furthermore, the DAFC-34B 2.6 kb construct does not
amplify though it contains the full-length Vm34Ca gene and at least 1 kb in both 5' and 3'
regulatory sequences. Thus, DAFC-34B can be used as a model replication origin to dissect what
regulatory elements, in addition to or in place of active transcription, are responsible for origin
activity.
DAFC-34B is also a distinctive replication model because the second round of replication
initiation occurs in the absence of detectable ORC. ORC binds to the region in a 10 kb zone
during stage 10, but it is not detectable by ChIP in subsequent stages. In contrast, DAFC-62D,
which also has a late stage of replication initiation, exhibits ORC binding throughout these later
stages. Despite the absence of ORC, MCMs are loaded at DAFC-34B by stage 13. Importantly,
there is not a pool of MCMs at this region during stages 11 and 12, revealing that ORC function
in stage 10 does not load MCMs that will be activated at a later stage.
What, then, can explain the absence of ORC at DAFC-34B after stage 10? One model is
that ORC is required for late initiation but that failure to detect enrichment by ChIP is due to the
epitope being masked, specifically after stage 10. ORC may undergo a significant structural
rearrangement uniquely at DAFC-34B. Thus, the apparent absence of ORC enrichment may be,
more accurately, absence of the same ORC structure that is present at stage 10 in DAFC-34B and
for all five other amplicons at all amplification stages.
Conversely, ORC may not be required for late initiation at DAFC-34B, and this amplicon
may demonstrate ORC-independent MCM loading and replication initiation. There are at least
three mechanistic explanations for ORC-independent initiation. First, Park and Asano reported
that ORC is dispensable for endoreplication in Drosophila (PARK and ASANO 2008). By
generating an Orc1 null allele, the authors showed that nuclei in homozygous mutant clones
reach the same size as nuclei in wild type clones. However, cell proliferation and gene
amplification, assessed by loss of BrdU foci, were disrupted in these mutants. Asano proposed
the existence of a protein or complex X that can recruit CDC6 and promote replication initiation
(ASANO 2009). If such a factor exists, it may also play a role in late initiation at DAFC-34B.
Second, DUP/CDT1 persists in late stage egg chambers and is present at elongating replicating
forks (CLAYCOMB et al. 2002). A pool of DUP/CDT1 at the origin may recruit MCMs for late
amplification at DAFC-34B. Finally, Lydeard et al. reported that break-induced replication in
yeast does not require ORC or Cdc6 (LYDEARD et al. 2010). The replication profile at DAFC-34B shows that late initiation occurs at a specific developmental stage and from the same region
as early initiation. Thus, it seems unlikely that double-stranded breaks generated by collapsed
replication forks could result in the symmetrical replication profile we observe. However, if the
first round of replication initiation led to some susceptibility of the DAFC-34B origin to incur a
double-stranded break, then random strand invasion occurring from both sides of the break could
result in a symmetric replication gradient. The necessity of the first round of replication initiation
to create a double-stranded break would also explain the requirement of ORC function for both
stages of amplification.
Investigation of the developmental and replication profiles of DAFC-34B has revealed
that it is a powerful model for gaining mechanistic insight into metazoan replication initiation. It
shows unique properties of developmental gene expression and replication initiation in the
absence of detectable ORC, making it a tractable experimental model for studying the influence
of transcription on replication initiation and possible ORC-independent means of replication
initiation. Furthermore, analysis of multiple follicle cell amplicons highlights the diversity of
amplification control mechanisms within the same cell type and is likely to be representative of
similar regulatory diversity during S phase DNA replication.
MATERIALS AND METHODS
RNA in situ hybridization
RNA in situ hybridization with colorimetric signal output was performed as previously
described (IVANOVSKA et al. 2005). 200-800 bp exonic fragments were PCR amplified from
CG7110, CG16848, Vm34Ca, Tehao, CG6866/loqs, CG9293, CG7099, CG10859, and beta'Cop.
PCR products were cloned into the pCRII-TOPO dual promoter vector (Invitrogen). Sense and
antisense probes were in vitro transcribed and digoxigenin-labeled using either T7 or SP6
polymerase depending on the orientation of the insert, according to the manufacturer's
instructions (Roche). Ovaries from wild type OrR fattened females were dissected in Grace's
medium and hybridized at 55°C. Fluorescent hybridization for CG16956 was performed using
the same DIG-labeled probes and hybridized as previously described (XIE and ORR-WEAVER
2008). Co-localization with the slbo marker was assessed using α-GFP (a gift from Mary Lou
Pardue) immunofluorescence immediately following RNA FISH as previously described for
visualizing other proteins in follicle cells (CLAYCOMB et al. 2002).
Quantitative PCR
Absolute quantitative (real-time) PCR was performed as described for the DAFC-34B
replication profile and cell population experiments (CLAYCOMB et al. 2004). Standard curves
were generated from four ten-fold dilutions of stage 1-8 egg chamber DNA or 0-4h embryonic
DNA. The endogenous control was a non-amplified locus at 62C5 (CLAYCOMB et al. 2002).
Relative quantification was performed as described (XIE and ORR-WEAVER 2008). Absolute
quantification was used for the DAFC-34B replication profile and cell population experiments
whereas relative quantification was used for all other experiments.
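The absolute quantification described above can be illustrated with a minimal sketch. It assumes a log-linear standard curve (Ct against log10 of template amount) fitted by least squares to a dilution series, and reports fold amplification as the amount at a target locus divided by the amount at the non-amplified control locus. The function names, the least-squares fit, and all Ct values are hypothetical illustrations and are not taken from the cited protocol (CLAYCOMB et al. 2004).

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Fit Ct = slope * log10(quantity) + intercept by ordinary least squares."""
    n = len(log10_quantities)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a template amount from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def fold_amplification(ct_target, ct_control, curve_target, curve_control):
    """Amount at the amplified locus divided by amount at the control locus."""
    q_target = quantity_from_ct(ct_target, *curve_target)
    q_control = quantity_from_ct(ct_control, *curve_control)
    return q_target / q_control


if __name__ == "__main__":
    # Hypothetical four-point, ten-fold dilution series (relative amounts).
    dilutions = [0, -1, -2, -3]          # log10 of relative template amount
    cts = [20.1, 23.4, 26.8, 30.0]       # made-up Ct readings
    curve = fit_standard_curve(dilutions, cts)
    print(fold_amplification(18.2, 20.1, curve, curve))
```

Fitting a separate curve for each primer pair, as the two `curve_*` arguments allow, avoids assuming that the target and control primers amplify with the same efficiency.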
For cell population experiments, follicle cells were isolated using a protocol modified
from Bryant et al. (BRYANT et al. 1999). ~150 whole ovaries were dissected in ice-cold
Schneider's medium supplemented with 10% FBS. Tissue was digested with 0.9 mL of 0.25%
Trypsin/EDTA and 0.1 mL of 50 mg/mL collagenase for 15 minutes at room temperature. The
supernatant was strained through a 40 µm mesh and spun at 100g for 7 minutes in the cold and
washed once with non-supplemented Grace's medium. GFP sorting was performed on a MoFlo2
at the MIT Koch Institute Flow Cytometry Core Facility. Cells were pelleted at 1000g for 7
minutes and processed for genomic DNA isolation as described (CLAYCOMB et al. 2002).
Nascent Strand Analysis
Nascent strand abundance analysis was performed for stage 10 egg chambers as
previously described (XIE and ORR-WEAVER 2008). Each fraction was analyzed for the
abundance of specific sequences by relative qPCR using 0-4 hour embryonic DNA as the
calibrator and a non-amplified locus at 62C5 as the endogenous control.
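Relative quantification against a calibrator sample and an endogenous control is commonly computed with the comparative Ct (2^-ΔΔCt) formula. The sketch below shows that arithmetic. Whether the cited protocol (XIE and ORR-WEAVER 2008) uses exactly this formula is an assumption, and the function name, the default efficiency of 2 (perfect doubling per cycle), and the Ct values are hypothetical.

```python
def relative_enrichment(ct_target_sample, ct_control_sample,
                        ct_target_calibrator, ct_control_calibrator,
                        efficiency=2.0):
    """Comparative Ct estimate of enrichment of a target sequence.

    The endogenous control (here, a non-amplified locus) corrects for the
    amount of DNA loaded; the calibrator (here, embryonic DNA) corrects for
    differences between primer pairs.
    """
    delta_sample = ct_target_sample - ct_control_sample
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta = delta_sample - delta_calibrator
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Made-up Ct values for one nascent strand fraction.
    print(relative_enrichment(22.0, 25.0, 24.0, 24.0))
```

A target that crosses threshold three cycles earlier than the control in the sample, but at the same cycle in the calibrator, is scored as enriched by a factor of 2 to the power 3 under the perfect-doubling assumption.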
Chromatin Immunoprecipitation
ChIP-qPCR was performed on 300 staged egg chambers per experiment as described
(XIE and ORR-WEAVER 2008). ChIP-chip was performed using four times the starting material.
All experiments were compared to input DNA. For hybridization to arrays, DNA was labeled
using Invitrogen's BioPrime Total for Agilent aCGH labeling kit. ChIP was performed with
ORC2 and MCM2-7 antibodies provided by Steve Bell. Array intensities were median
normalized across channels and smoothed by genomic windows of 1 kb using the Ringo package
in R (TOEDLING et al. 2007).
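The normalization and smoothing steps can be sketched conceptually as follows. This is not the Ringo implementation and is written in Python rather than R. It assumes that median normalization means shifting the log2(ChIP/input) ratios so that their median is zero, and that smoothing means taking the median of all probes within a 1 kb window centered on each probe. The function names and the toy intensities are hypothetical.

```python
import math
from statistics import median


def median_normalize(chip_intensities, input_intensities):
    """Return log2(ChIP/input) ratios shifted so that their median is zero."""
    log_ratios = [math.log2(c / i)
                  for c, i in zip(chip_intensities, input_intensities)]
    center = median(log_ratios)
    return [r - center for r in log_ratios]


def smooth_by_window(positions, values, window_bp=1000):
    """Replace each probe value by the median of probes within the window."""
    half = window_bp / 2
    smoothed = []
    for p in positions:
        nearby = [v for q, v in zip(positions, values) if abs(q - p) <= half]
        smoothed.append(median(nearby))
    return smoothed


if __name__ == "__main__":
    # Toy probe positions (bp) and two-channel intensities.
    positions = [0, 400, 2000]
    normalized = median_normalize([2.0, 4.0, 8.0], [1.0, 1.0, 1.0])
    print(smooth_by_window(positions, normalized))
```

A windowed median is less sensitive to single outlier probes than a windowed mean, which is one reason it is often preferred for tiling array data.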
Transgenic fly construction
To test the cis requirements for amplification at DAFC-34B, we constructed transposons
with various sequences from the most amplified region of DAFC-34B flanked by suppressor of
Hairy wing binding sites (SHWBS) to control for genomic position-specific integration effects.
The 10kb and 4.5kb central amplified regions were PCR amplified from BACR06A03 using
exTaq DNA polymerase (Takara) and primers with AscI and AvrII sites on the forward and
reverse sequences, respectively. These products were cloned into a modified PCRA vector with
AscI and AvrII sequences engineered into the multiple cloning site (LU et al. 2001). These
plasmids are called PCRA_34B_10kb and PCRA_34B_4.5kb. These plasmids were digested
with NotI and subjected to a partial XhoI digest to transfer the 10kb and 4.5kb inserts to the NotI
and XhoI sites of Big Parent to generate BP_34B_10kb and BP_34B_4.5kb (LU et al. 2001).
BP_34B_6kb was generated from the partial digest of PCRA_34B_10kb.
The 1kb origin mapped by nascent strand analysis was PCR amplified from BACR06A03
using exTaq DNA polymerase (Takara) and primers with NheI sites. The product was cloned
into the original PCRA vector, generating PCRA_34B_1kb, which was subsequently cloned into
the BP vector at the NotI and XhoI sites to generate BP_34B_1kb.
ACKNOWLEDGEMENTS
We thank Steve Bell for ORC2 and MCM antibodies and Mary Lou Pardue for GFP
antibodies. Flies were provided by the Bloomington Stock Center.
REFERENCES
ALADJEM, M. I., 2007 Replication in context: dynamic regulation of DNA replication patterns in
metazoans. Nat Rev Genet 8: 588-600.
ASANO, M., 2009 Endoreplication: the advantage to initiating DNA replication without the
ORC? Fly (Austin) 3: 173-175.
BRYANT, Z., L. SUBRAHMANYAN, M. TWOROGER, L. LATRAY, C. R. LIU et al., 1999
Characterization of differentially expressed genes in purified Drosophila follicle cells:
toward a general strategy for cell type-specific developmental analysis. Proc Natl Acad
Sci U S A 96: 5559-5564.
CADORET, J. C., F. MEISCH, V. HASSAN-ZADEH, I. LUYTEN, C. GUILLET et al., 2008 Genome-wide studies highlight indirect links between human replication origins and gene
regulation. Proc Natl Acad Sci U S A 105: 15837-15842.
CLAYCOMB, J. M., M. BENASUTTI, G. BOSCO, D. D. FENGER and T. L. ORR-WEAVER, 2004 Gene
amplification as a developmental strategy: isolation of two developmental amplicons in
Drosophila. Dev Cell 6: 145-155.
CLAYCOMB, J. M., D. M. MACALPINE, J. G. EVANS, S. P. BELL and T. L. ORR-WEAVER, 2002
Visualization of replication initiation and elongation in Drosophila. J Cell Biol 159: 225-236.
CLAYCOMB, J. M., and T. L. ORR-WEAVER, 2005 Developmental gene amplification: insights
into DNA replication and gene expression. Trends Genet 21: 149-162.
CVETIC, C., and J. C. WALTER, 2005 Eukaryotic origins of DNA replication: could you please be
more specific? Semin Cell Dev Biol 16: 343-353.
DIETZL, G., D. CHEN, F. SCHNORRER, K. C. SU, Y. BARINOVA et al., 2007 A genome-wide
transgenic RNAi library for conditional gene inactivation in Drosophila. Nature 448:
151-156.
GILBERT, D. M., 2004 In search of the holy replicator. Nat Rev Mol Cell Biol 5: 848-855.
HAMLIN, J. L., L. D. MESNER and P. A. DIJKWEL, 2010 A winding road to origin discovery.
Chromosome Res 18: 45-61.
HANSEN, R. S., S. THOMAS, R. SANDSTROM, T. K. CANFIELD, R. E. THURMAN et al., 2010
Sequencing newly replicated DNA reveals widespread plasticity in human replication
timing. Proc Natl Acad Sci U S A 107: 139-144.
HECK, M. M., and A. C. SPRADLING, 1990 Multiple replication origins are used during
Drosophila chorion gene amplification. J Cell Biol 110: 903-914.
IVANOVSKA, I., T. KHANDAN, T. ITO and T. L. ORR-WEAVER, 2005 A histone code in meiosis:
the histone kinase, NHK-1, is required for proper chromosomal architecture in
Drosophila oocytes. Genes Dev 19: 2571-2582.
KARNANI, N., C. M. TAYLOR, A. MALHOTRA and A. DUTTA, 2010 Genomic study of replication
initiation in human chromosomes reveals the influence of transcription regulation and
chromatin structure on origin selection. Mol Biol Cell 21: 393-404.
KITSBERG, D., S. SELIG, I. KESHET and H. CEDAR, 1993 Replication structure of the human beta-globin gene domain. Nature 366: 588-590.
LANDIS, G., R. KELLEY, A. C. SPRADLING and J. TOWER, 1997 The k43 gene, required for
chorion gene amplification and diploid cell chromosome replication, encodes the
Drosophila homolog of yeast origin recognition complex subunit 2. Proc Natl Acad Sci U
S A 94: 3888-3892.
LU, L., and J. TOWER, 1997 A transcriptional insulator element, the su(Hw) binding site, protects
a chromosomal DNA replication origin from position effects. Mol Cell Biol 17: 2202-2206.
LU, L., H. ZHANG and J. TOWER, 2001 Functionally distinct, sequence-specific replicator and
origin elements are required for Drosophila chorion gene amplification. Genes Dev 15:
134-146.
LUCAS, I., A. PALAKODETI, Y. JIANG, D. J. YOUNG, N. JIANG et al., 2007 High-throughput
mapping of origins of replication in human cells. EMBO Rep 8: 770-777.
LYDEARD, J. R., Z. LIPKIN-MOORE, Y. J. SHEU, B. STILLMAN, P. M. BURGERS et al., 2010 Break-induced replication requires all essential DNA replication factors except those specific
for pre-RC assembly. Genes Dev 24: 1133-1144.
MACALPINE, H. K., R. GORDAN, S. K. POWELL, A. J. HARTEMINK and D. M. MACALPINE, 2010
Drosophila ORC localizes to open chromatin and marks sites of cohesin complex
loading. Genome Res 20: 201-211.
MESNER, L. D., E. L. CRAWFORD and J. L. HAMLIN, 2006 Isolating apparently pure libraries of
replication origins from complex genomes. Mol Cell 21: 719-726.
MINDRINOS, M. N., L. J. SCHERER, F. J. GARCINI, H. KWAN, K. A. JACOBS et al., 1985 Isolation
and chromosomal location of putative vitelline membrane genes in Drosophila
melanogaster. Embo J 4: 147-153.
MONTELL, D. J., P. RORTH and A. C. SPRADLING, 1992 slow border cells, a locus required for a
developmentally regulated cell migration during oogenesis, encodes Drosophila C/EBP.
Cell 71: 51-62.
PARK, S. Y., and M. ASANO, 2008 The origin recognition complex is dispensable for
endoreplication in Drosophila. Proc Natl Acad Sci U S A 105: 12343-12348.
REMUS, D., E. L. BEALL and M. R. BOTCHAN, 2004 DNA topology, not DNA sequence, is a
critical determinant for Drosophila ORC-DNA binding. Embo J 23: 897-907.
SEQUEIRA-MENDES, J., R. DIAZ-URIARTE, A. APEDAILE, D. HUNTLEY, N. BROCKDORFF et al.,
2009 Transcription initiation activity sets replication origin efficiency in mammalian
cells. PLoS Genet 5: e1000446.
SPRADLING, A. C., 1981 The organization and amplification of two chromosomal domains
containing Drosophila chorion genes. Cell 27: 193-201.
TOEDLING, J., O. SKLYAR, T. KRUEGER, J. J. FISCHER, S. SPERLING et al., 2007 Ringo--an
R/Bioconductor package for analyzing ChIP-chip readouts. BMC Bioinformatics 8: 221.
WANG, X., J. BO, T. BRIDGES, K. D. DUGAN, T. C. PAN et al., 2006 Analysis of cell migration
using whole-genome expression profiling of migratory cells in the Drosophila ovary. Dev
Cell 10: 483-495.
XIE, F., and T. L. ORR-WEAVER, 2008 Isolation of a Drosophila amplification origin
developmentally activated by transcription. Proc Natl Acad Sci U S A 105: 9651-9656.
Chapter Four:
Conclusions and Perspectives
In this thesis, we have investigated metazoan DNA replication initiation using Drosophila
follicle cell gene amplification as a model system. We have combined genomic approaches to
examine whole-genome views of ORC binding, transcription, and histone modifications with
molecular analyses of individual amplicons to uncover their developmental properties and
distinctive modes of regulation. DAFC-22B displays strain-specific amplification and is a model
replicon to study the determinants of ORC binding. DAFC-34B exhibits replication initiation in
the absence of ORC binding and provides a way to investigate ORC-independent means of
replication initiation. As genome-wide studies in cell culture identify more replication origins
and correlative relationships, this abundance of data highlights the importance of gaining
mechanistic understanding for how individual replication origins are specified and activated.
This thesis demonstrates the utility in combining genomic approaches with detailed molecular
analyses of individual replication origins, which Drosophila gene amplification uniquely offers
as an in vivo metazoan experimental model to study DNA replication. Future studies with the
follicle cell amplicons, described here, will further elucidate the extent to which different origins
are similar and diverse in terms of their regulation.
Active transcription as a causal determinant of gene amplification
This thesis has demonstrated that gene amplification does not, in all cases, promote high
transcription levels or augment transcription. The examples of DAFC-22B, where amplification
apparently has no effect on CG7337 expression levels, and DAFC-34B, where Vm34Ca is
transcribed prior to the onset of gene amplification, raise the possibility that active transcription
may promote gene amplification at these loci. There is already one example of this link in
follicle cells. At DAFC-62D active transcription is required for the late stage initiation,
specifically to load MCM complexes (XIE and ORR-WEAVER 2008). Investigating whether active
transcription plays a causal role in replication initiation can be readily done for DAFC-34B,
where a transformed 6 kb sequence is sufficient for amplification. Modifications can be made to
this 6 kb sequence, including deletion of the Vm34Ca promoter region, substitution of the
Vm34Ca promoter region with another sequence, or changing the orientation of the Vm34Ca
gene, to test their effects on gene amplification. Additionally, Vm34Ca can be substituted with
any of its homologous genes (Vm26Aa, Vm26Ab, or Vm32E) to determine whether something
about the Vm34Ca sequence specifically is required for amplification or merely active
transcription at a particular developmental stage. It will be important to introduce a sequence tag
to the Vm34Ca gene to distinguish the transcript of the transformed 6 kb from the endogenous
gene when assessing, for example, the effect of the promoter deletion.
Although active transcription may promote replication initiation at some genomic
regions, it is clearly not sufficient for amplification, as there are many highly expressed gene
regions that do not display a significant increase in copy number. One possibility is that active
transcription promotes ORC binding and replication initiation only when it occurs during
specific stages. Because 16C follicle cells contain a mixed population of stages 9 through 14, the
highly expressed genes that are not in amplified regions may be expressed exclusively in stage
14, though we know this is not the case for vitelline membrane genes. It is also possible that
highly expressed genes were actively transcribed prior to stage 9 and persist in follicle cells, as is
certainly the case for genes encoding components of the vitelline membrane. Because the time
span of 16C follicle cells is over 20 hours, however, we believe there must be some highly
expressed genes that were actively transcribed during amplification stages but are not amplified.
These genomic regions are also useful models to investigate what properties, in addition to active
transcription, are necessary for gene amplification.
Genome-wide origin mapping studies in cell culture have found that origins are
significantly proximal to or overlapping with RNAPII binding sites (KARNANI et al. 2010;
MACALPINE et al. 2010). It will be important to determine whether the follicle cell amplified
regions have different localization patterns or levels of RNAPII enrichment compared to
non-amplified regions. ChIP-seq of RNAPII is capable of distinguishing between active transcription
and genes poised for transcription. However, one technical consideration of this experiment is
starting material. Performing ChIP using staged egg chambers allows developmental timing to be
precisely known. For example, the absence of ORC at DAFC-34B would not have been detected
without stage specific experiments. However, using hand-sorted egg chambers means that nurse
cells comprise a significant portion of the population at stage 10. Because nurse cells are not
known to amplify the DAFC regions (stage 9 egg chambers show no regions of amplification,
[Eng T, personal communication]), ChIP of replication proteins using whole egg chambers
almost certainly reflects the follicle cell population. However, nurse cells are very
transcriptionally active, so it would be difficult to assess whether RNAPII signal came from the
nurse cells or follicle cells for stage 10 egg chambers; nurse cells undergo apoptosis in stage 11.
We have not yet performed ChIP using flow sorted 16C nuclei, but this approach would ensure
pure populations of follicle cell DNA without the resolution of distinct developmental stages.
This method would also likely require amplification of ChIP DNA because of the low quantity of
starting material, a step that was not necessary when enough staged egg chambers were isolated.
Combining approaches, ChIP-seq of staged egg chambers as well as 16C follicle cell DNA, may
elucidate distinctive properties of RNAPII localization that cause these highly expressed regions
not to become amplified.
Although there are only six genomic regions that are amplified significantly in follicle
cells (greater than 2-fold), close examination of the aCGH data of 16C follicle cells reveals
regions that show apparent low levels of amplification. For example, at 26A, which contains a
cluster of genes highly expressed in follicle cells, there is a 100 kb region of very low DNA
enrichment (maximum log2 ratio is 0.4024 at 26A) (Figure 2-3A). One possibility is that high
transcriptional activity of this region enables replication initiation to occur, but only in a subset
(50% or less) of follicle cells. Because 0.5-fold enrichment would be difficult to assess by qPCR
using individual probes, further investigation of these potential regions of replication initiation
would require aCGH experiments. It would also be important to determine if these low levels of
DNA enrichment were eliminated when RNAPII activity was reduced, which could be achieved
using transgenic flies with RNAi knock-down of RNAPII subunits or a temperature-sensitive
allele of RpII215 (MORTIN and KAUFMAN 1982).
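As a worked illustration of the arithmetic behind this interpretation, the sketch below converts an aCGH log2 ratio into a linear fold enrichment and then estimates the fraction of cells that would have to initiate replication to produce it. The two-population model (amplifying cells carry two relative copies of the locus, all other cells carry one) and the function names are assumptions made for illustration; they are not part of the analysis reported in this thesis.

```python
def log2_ratio_to_fold(log2_ratio: float) -> float:
    """Convert an aCGH log2 ratio to a linear fold enrichment."""
    return 2.0 ** log2_ratio


def fraction_amplifying(fold: float, copies_in_amplifying_cells: float = 2.0) -> float:
    """Fraction f of cells that amplify, assuming a two-population mixture.

    Assumed model: non-amplifying cells contribute 1 relative copy and
    amplifying cells contribute `copies_in_amplifying_cells`, so the
    population average is fold = (1 - f) * 1 + f * copies.
    """
    if copies_in_amplifying_cells <= 1.0:
        raise ValueError("amplifying cells must carry more than one relative copy")
    f = (fold - 1.0) / (copies_in_amplifying_cells - 1.0)
    # Clamp to [0, 1] so that measurement noise cannot yield an impossible fraction.
    return min(max(f, 0.0), 1.0)


if __name__ == "__main__":
    fold = log2_ratio_to_fold(0.4024)  # maximum log2 ratio reported at 26A
    print(f"fold enrichment: {fold:.2f}")
    print(f"fraction of cells with one extra round: {fraction_amplifying(fold):.2f}")
```

Under these assumptions, the maximum log2 ratio of 0.4024 at 26A corresponds to roughly 1.3-fold enrichment, which would be consistent with a single extra round of initiation in about one third of the follicle cells.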
CG7337 expression and strain-specific amplification of DAFC-22B
DAFC-22B is unique in that a single 60 kb gene is located in the most amplified region.
Surprisingly, strains that do and do not amplify the region show the same overall expression
levels of CG7337. The transcription start site of the D isoform corresponds to the most amplified
region, and when we observed higher levels of this isoform in non-amplifying OrRMOD staged
egg chambers, it was appealing to hypothesize that expression of this isoform was responsible
for inhibiting amplification. However, when we examined expression in purified follicle cells,
we did not observe a difference in D isoform expression levels between the two strains. Although
we attempted to examine the expression of CG7337 by in situ hybridization using multiple
probes and both colorimetric as well as fluorescent detection, we were not able to detect
expression in follicle cells. Because CG7337 expression may play a critical role in DAFC-22B
amplification, it would be important to revisit the localization of this gene by in situ
hybridization.
We identified sequence differences between the closely related non-amplifying OrRMO
and amplifying OrR TOW strains and used P element mediated transformation to introduce these
sequences into flies. We are currently waiting to recover transformants of these sequences. By
testing for ectopic amplification, we will be able to uncover what sequence differences, if any,
are responsible for differential ORC binding and amplification. Furthermore, because DAFC-22B
is the only amplicon known to bind ORC in another cell type (multiple cell culture lines), we are
also testing whether this sequence, buffered with Suppressor of Hairy-wing binding sites, will
enable gene amplification in follicle cells.
Specific histone acetylation and gene amplification
Using an amplification reporter system, we demonstrated that H4K8 acetylation is
necessary for amplification of TT1 (the minimal sequence for DAFC-66D amplification).
Although we demonstrated by ChIP-qPCR that H4K8 acetylation at the amplicons was similar to
the pattern of tetra-acetylated H4, we did not examine H4K8 acetylation on a genome-wide
scale. Tetra-acetylated H4 is enriched at many sites across the genome, and it is possible that
H4K8 specifically marks sites of replication initiation. Additionally, we can test antibodies
specific for H4K16 acetylation, as this reagent (verified for specificity) is now available.
However, genome-wide localization of H4K8 (or other histone modifications) has the same
challenges as the localization of RNAPII by ChIP; it is impossible to distinguish signal arising
from follicle cells versus nurse cells. As with RNAPII, it will be important to perform ChIP-chip
or ChIP-seq on both staged egg chambers, to retain developmental timing information, as well as
purified follicle cell DNA, to ensure tissue specificity of the signal. These experiments will also
reveal whether there are differences in H4 acetylation at the later egg chamber stages compared
to stage 10, which may influence late amplification at DAFC-34B or DAFC-62D.
Investigating ORC-independent initiation at DAFC-34B
At DAFC-34B, there is no detectable ORC after stage 10 despite a late round of
replication initiation. In Chapter 3, we proposed several models to explain this observation. First,
ORC may be required for late amplification, but it may undergo a significant structural
rearrangement uniquely at DAFC-34B that masks the ORC2 antibody epitope. An ORC1
antibody is available, and although use of this antibody has not been reported for ChIP, we can
test for developmental localization of ORC1.
Additional models posit ORC-independent mechanisms of replication initiation. To
address these models, conditional replication factors will be necessary. In Appendix 1, we report
our strategy and preliminary results toward generating conditional replication factor mutants in
ORC1, MCM6, and DUP/CDT1. Another model proposes that late initiation at DAFC-34B is due
to break-induced replication. In yeast, Pol32 is uniquely required for break-induced replication
(LYDEARD et al. 2007). By BLAST analysis, CG3975 is the top candidate for a Pol32 homolog
in Drosophila. There are several P element insertion lines in CG3975, and it will be important to
assess DAFC-34B amplification in these lines. Because gene amplification may produce
double-stranded breaks at the elongating replication forks of other amplicons, it will be informative to
test for enrichment of γ-H2Av by ChIP-chip as well as genome-wide aCGH analysis in CG3975
mutants.
Application of Drosophila genomic resources to the study of follicle gene amplification
In this thesis we have demonstrated that follicle cell gene amplification is a powerful in
vivo model to study DNA replication. There are many questions to address at the level of
individual amplicons. However, Drosophila offers many powerful genomic resources that can be
used to further investigate follicle cell gene amplification and uncover replication regulatory
mechanisms. There are several transgenic RNAi libraries in Drosophila. In the process of
studying DAFC-22B amplification, we discovered that this region is amplified in the genetic
background of the VDRC collection (DIETZL et al. 2007). Such a collection makes a genome-wide
screen for genes affecting amplification very feasible. For example, by crossing RNAi lines
to a ubiquitous follicle cell driver such as c323a, dissecting stage 13 egg chambers, and
performing qPCR using probes for the DAFC-66D origin, a region 50 kb away from the DAFC-66D
origin, or the DAFC-22B most amplified region, one could identify genes affecting
amplification levels, replication elongation, and possibly DAFC-22B specific replication. It may
be very informative to implement a candidate screen on groups of genes such as transcription
factors, histone modifying enzymes, and chromatin remodeling factors to test the effectiveness of
this strategy.
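For a screen of this kind, each RNAi line would need a relative copy number computed from its qPCR measurements. The sketch below uses the comparative Ct calculation, a common way to do this, and assumes perfect doubling per cycle unless another efficiency is supplied. The choice of this calculation, the function name, and the Ct values are assumptions for illustration only; the thesis does not specify them here.

```python
def relative_copy_number(ct_target_test: float, ct_ref_test: float,
                         ct_target_ctrl: float, ct_ref_ctrl: float,
                         efficiency: float = 2.0) -> float:
    """Relative copy number of a target locus by the comparative Ct method.

    delta_ct       = Ct(target) - Ct(non-amplified reference locus), per sample
    delta_delta_ct = delta_ct(test sample) - delta_ct(control sample)
    fold           = efficiency ** (-delta_delta_ct)
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    return efficiency ** -(delta_ct_test - delta_ct_ctrl)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 4 cycles earlier
    # than the reference in the test sample, and at the same cycle in control DNA.
    fold = relative_copy_number(ct_target_test=20.0, ct_ref_test=24.0,
                                ct_target_ctrl=24.0, ct_ref_ctrl=24.0)
    print(f"relative copy number: {fold:.1f}")
```

Comparing this value between an RNAi line and a driver-only control, for each of the three probes, would indicate whether a knock-down affects origin firing, fork progression, or DAFC-22B specifically.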
In addition to transgenic and molecular reagents, Drosophila has the advantage of having
12 species with genome sequences (CLARK et al. 2007). It will be informative to do comparative
analysis of the amplicon regions to potentially identify DNA elements important for
amplification based on sequence conservation. Additionally, we performed preliminary synteny
analysis of the follicle cell amplicons and discovered that there is a break in synteny at
DAFC-30B in D. pseudoobscura that makes it a useful tool for studying the sequence boundaries
necessary for amplification (Appendix 2). In D. pseudoobscura, we observed that there was an
increase in copy number at stage 10 and stage 13. Consistent with late amplification, our
developmental analysis of ORC localization by ChIP-chip shows ORC binding in pooled stages
11 and 12 at DAFC-30B (Appendix 3). Because a complete replication profile at DAFC-30B has
not been performed, it will be important to revisit this experiment to examine whether it also
displays unique replication properties or late replication initiation.
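A break in synteny of the kind described above can be found by asking whether genes that are neighbours in one species have orthologs that are still neighbours in the other. The sketch below is one minimal way to do that. It assumes ortholog assignments and gene order are already available as simple Python structures, and the gene and scaffold names are hypothetical placeholders. It is not the procedure used for Appendix 2.

```python
from typing import Dict, List, Tuple


def synteny_breaks(reference_order: List[str],
                   other_position: Dict[str, Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Return adjacent reference gene pairs whose orthologs are not adjacent.

    reference_order: genes in chromosomal order in the reference species.
    other_position:  gene -> (scaffold, rank) of its ortholog in the other
                     species, where rank is the ortholog's index among genes
                     on that scaffold that have a reference ortholog.
                     Genes without an ortholog are skipped.
    """
    with_ortholog = [g for g in reference_order if g in other_position]
    breaks = []
    for left, right in zip(with_ortholog, with_ortholog[1:]):
        scaffold_l, rank_l = other_position[left]
        scaffold_r, rank_r = other_position[right]
        # Neighbours remain syntenic if they share a scaffold and are adjacent
        # in rank (either orientation, to tolerate local inversions).
        if scaffold_l != scaffold_r or abs(rank_l - rank_r) != 1:
            breaks.append((left, right))
    return breaks


if __name__ == "__main__":
    # Hypothetical gene order and ortholog positions for illustration only.
    order = ["geneA", "geneB", "geneC", "geneD"]
    positions = {
        "geneA": ("scaffold_1", 10),
        "geneB": ("scaffold_1", 11),
        "geneC": ("scaffold_2", 40),
        "geneD": ("scaffold_2", 41),
    }
    print(synteny_breaks(order, positions))
```

In this toy input the only reported break lies between geneB and geneC, whose orthologs fall on different scaffolds.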
Close molecular analysis of all six amplicons will provide the first comprehensive
analysis of all replication origins in a metazoan cell type. These studies will provide an important
picture of what properties are shared or distinct in the specification of ORC localization and
replication initiation.
REFERENCES
CLARK, A. G., M. B. EISEN, D. R. SMITH, C. M. BERGMAN, B. OLIVER et al., 2007 Evolution of
genes and genomes on the Drosophila phylogeny. Nature 450: 203-218.
DIETZL, G., D. CHEN, F. SCHNORRER, K. C. Su, Y. BARINOVA et al., 2007 A genome-wide
transgenic RNAi library for conditional gene inactivation in Drosophila. Nature 448: 151-156.
KARNANI, N., C. M. TAYLOR, A. MALHOTRA and A. DUTTA, 2010 Genomic study of replication
initiation in human chromosomes reveals the influence of transcription regulation and
chromatin structure on origin selection. Mol Biol Cell 21: 393-404.
LYDEARD, J. R., S. JAIN, M. YAMAGUCHI and J. E. HABER, 2007 Break-induced replication and
telomerase-independent telomere maintenance require Pol32. Nature 448: 820-823.
MACALPINE, H. K., R. GORDAN, S. K. POWELL, A. J. HARTEMINK and D. M. MACALPINE, 2010
Drosophila ORC localizes to open chromatin and marks sites of cohesin complex
loading. Genome Res 20: 201-211.
MORTIN, M. A., and T. C. KAUFMAN, 1982 Developmental genetics of a temperature-sensitive
RNA polymerase II mutation in Drosophila melanogaster. Mol Gen Genet 187: 120-125.
XIE, F., and T. L. ORR-WEAVER, 2008 Isolation of a Drosophila amplification origin
developmentally activated by transcription. Proc Natl Acad Sci U S A 105: 9651-9656.
Appendix One:
Strategy and preliminary results toward generation of conditional
replication factor mutants in Drosophila
Jane C. Kim, Wendy M. Lam¹, James M. Berger², Stephen P. Bell¹, and Terry L. Orr-Weaver
¹ Dept. of Biology, MIT and HHMI
² Dept. of Molecular and Cell Biology, UC Berkeley
W.L. performed the yeast experiments.
J.B. identified candidate sites for TEV protease cleavage site insertion.
J.K. performed the molecular cloning.
INTRODUCTION
In metazoans one challenge of studying the functions of essential genes in a temporally
specified manner is the difficulty in obtaining and generating conditional null mutants. Some
temperature-sensitive alleles have been isolated in mutagenesis screens, but these mutagenic
events are not amenable to studying specifically defined genes (GAZIOVA et al. 2004; SUZUKI et
al. 1971).
Recently, the Nasmyth lab developed an approach to generate a conditional mutant of the
SCC1/RAD21 component of the essential cohesin complex in Drosophila (PAULI et al. 2008). By
engineering a version of RAD21 with three tandem tobacco etch virus (TEV) protease
cleavage sites that could complement a Rad21 null mutant, TEV protease expression could be
induced to inactivate RAD21 function in a cell-type specific and temporally controlled manner.
This appendix describes the strategy and initial experiments to generate conditional
mutants in Drosophila in essential replication initiation components, specifically the
pre-replication complex (pre-RC) proteins ORC1, MCM6, and DUP/CDT1. Having these tools will
enable the study of a number of key questions related to the developmental regulation of DNA
replication such as the requirement of ORC function for all initiation events during follicle cell
amplification, the role of DUP/CDT1 during replication elongation, and the requirement of
polyploidy in various Drosophila tissues. These applications will be described further in the
Discussion.
RESULTS
To construct conditional mutants in Drosophila, our goal was to identify positions within
the protein sequence for which addition of three tandem TEV protease cleavage sites (amino
acids ENLYFQG) would, when introduced in a transgenic fly, rescue a null mutant. Upon
expression of TEV protease, the engineered protein should be completely inactivated. Because
this strategy requires a null mutant and the ability to rescue this mutant, we narrowed our choice
of ORC and MCM subunit by selecting specific genes for which the criteria of null mutants and
transgenic complementation were fulfilled: Orc1 and Mcm6 (PARK and ASANO 2008; SCHWED et
al. 2002).
For dup/Cdt1, it was previously shown that expression of dup cDNA driven by the UAS-GAL4
system failed to rescue the sterility or lethality of any mutant combination (CLAYCOMB
2004). One explanation is that dup expression is precisely regulated and mutants can only be
rescued by a genomic construct and not via the ectopic expression inherent in the UAS-GAL4
system. We therefore decided to introduce a large BAC containing the dup genomic region into
flies using site-specific integration (VENKEN et al. 2009). The 60 kb BAC CH321-89D23
(Figure A1-1) was injected into two lines, VK33 (65B2) and VK37 (22A3). These lines will
require genetic testing to determine whether they rescue dup mutant alleles.
We decided to test different insertion positions for TEV protease cleavage sites in Orc1p,
Mcm6p, and Cdt1p first in budding yeast. Structural analysis was used to predict positions
between structured domains or unordered regions between ordered regions. The target sequences
are listed in Table A1-1 by rank based on protein structure.
To test the candidate sites in budding yeast, we took the approach outlined in Figure A1-2.
Two unique six-base-pair restriction sites were introduced to the target position by site-directed
mutagenesis. Three tandem TEV protease cleavage sites were synthesized with the appropriate
flanking restriction sites and cloned into the target position. The modified gene was integrated
into the LEU2 locus. For the Orc1p and Mcm6p candidates, we could use "swapper strains" to
test the ability of the modified gene to complement wild type function. These URA3- strains have
ORC1 and MCM6 genomic deletions with URA3+ plasmids containing the wild type gene.
Selection of URA3- clones using 5-FOA will determine whether the modified gene is functional.
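At the protein level, the engineered alleles described above amount to inserting three copies of the TEV recognition sequence at a chosen position. The sketch below illustrates that operation on an amino acid string. It assumes that the residue named in the Target column of Table A1-1 is the one immediately preceding the insertion point (the residue to the left of the dash in the Amino Acid Region column). The function name and the toy peptide are hypothetical, and the actual cloning was done at the DNA level through engineered restriction sites.

```python
TEV_SITE = "ENLYFQG"  # TEV protease recognition sequence given in the text


def insert_tev_sites(protein: str, target: str, copies: int = 3) -> str:
    """Insert tandem TEV cleavage sites immediately after a target residue.

    target uses the notation of Table A1-1, e.g. "K450": a one-letter amino
    acid code followed by its 1-based position in the protein.
    """
    residue, position = target[0], int(target[1:])
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the protein")
    if protein[position - 1] != residue:
        raise ValueError(
            f"expected {residue} at position {position}, found {protein[position - 1]}"
        )
    return protein[:position] + TEV_SITE * copies + protein[position:]


if __name__ == "__main__":
    # Toy peptide for illustration only; not a real protein sequence.
    toy = "MKVTQKSSNAN"
    print(insert_tev_sites(toy, "K6"))
```

Checking the residue identity before inserting guards against off-by-one errors when positions are transferred between the yeast and Drosophila sequences.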
For the Cdt1p candidate, we integrated the modified gene into a temperature-sensitive
strain and tested for growth at the restrictive temperature. Using this approach, we found that the
cdt1-K450 allele complemented wild type function and showed reduced growth upon
galactose-induced expression of TEV protease. The equivalent mutation in Drosophila, K543, will be
constructed as a candidate conditional replication factor mutant. The results and status of cloning
for the other candidates are summarized in Table A1-2.
DISCUSSION
We have described the strategy and initial experiments to generate conditional mutants of
Orc1, Mcm6, and dup/Cdt1 in Drosophila. Once generated, these tools will enable the study of a
number of key questions related to the developmental regulation of DNA replication. We have
shown that a new follicle cell amplicon DAFC-34B displays two rounds of DNA replication
initiation separated by an elongation phase (Chapter 3). Although ORC is localized to the
amplicon during the first stage of amplification, it is absent in subsequent stages despite the later
round of origin activation, suggesting a possible ORC-independent mechanism for initiating DNA
replication. By using a conditional allele, we will specifically inactivate ORC function after the
first round of amplification to determine if, in fact, ORC is necessary for the second round.
One of the advantages of using follicle cell gene amplification in Drosophila as a model
to study metazoan DNA replication is that the process can be directly visualized. Using
immunofluorescence and detection of newly incorporated bromodeoxyuridine (BrdU), a
nucleotide analog, DUP/CDT1 was found to co-localize with elongating replication forks
(CLAYCOMB et al. 2002). Given the unique onionskin structure of amplicons, with replication
bubbles within replication bubbles and the possible head-to-tail collision of replication forks, it
was proposed that DUP/CDT1 might serve as a processivity factor for the MCM helicase or
facilitate continuous helicase reloading at these slow replication forks (CLAYCOMB et al. 2002).
A conditional mutant of DUP/CDT1 will allow its inactivation after replication initiation has
occurred and permit the study of what role this pre-RC component has during replication
elongation.
Many plant and animal cells use endocycles as a developmental strategy to increase DNA
content (EDGAR and ORR-WEAVER 2001). This increase in nuclear DNA typically corresponds to
a proportional increase in cell size and increased metabolic activity. Studies in Drosophila have
implicated Notch signaling as a key pathway in regulating the mitotic-to-endocycle switch, and
several target genes have been identified (DENG et al. 2001; SUN and DENG 2005). Because
Notch signaling also plays a critical role in cell proliferation, it is difficult to use Notch
inactivation to study the direct impact of inhibiting the endocycles without perturbing other
developmental processes. Conditional replication factor mutants will allow replication to be
inhibited after the mitotic cycles have occurred and the normal cell number reached to determine
the importance of increased ploidy on cellular and tissue function. Because these conditional
mutants will employ the use of GAL4 drivers to induce protein inactivation, tissue-specific
drivers can be used to determine if increased ploidy plays differential roles in various Drosophila
tissues.
Several questions regarding ORC function during gene amplification, DUP/CDT1
function during replication elongation, and the importance of polyploidy in different Drosophila
tissues remain unstudied because of the lack of appropriate experimental tools. The Nasmyth lab
has shown that conditional mutants of a specific gene can be generated in Drosophila. Finding
the appropriate position in the ORC1, MCM6, and DUP/CDT1 proteins that will permit insertion
of the TEV protease cleavage sites and be accessible to TEV protease cleavage will allow these
important questions to be addressed.
REFERENCES
CLAYCOMB, J. M., 2004 Gene Amplification in Drosophila Ovarian Follicle Cells as a
Developmental Strategy and Model for Metazoan DNA Replication, pp. 203 in Biology.
MIT, Cambridge, MA.
CLAYCOMB, J. M., D. M. MACALPINE, J. G. EVANS, S. P. BELL and T. L. ORR-WEAVER, 2002
Visualization of replication initiation and elongation in Drosophila. J Cell Biol 159: 225-236.
DENG, W. M., C. ALTHAUSER and H. RUOHOLA-BAKER, 2001 Notch-Delta signaling induces a
transition from mitotic cell cycle to endocycle in Drosophila follicle cells. Development
128: 4737-4746.
EDGAR, B. A., and T. L. ORR-WEAVER, 2001 Endoreplication cell cycles: more for less. Cell 105:
297-306.
GAZIOVA, I., P. C. BONNETTE, V. C. HENRICH and M. JINDRA, 2004 Cell-autonomous roles of the
ecdysoneless gene in Drosophila development and oogenesis. Development 131: 2715-2725.
PARK, S. Y., and M. ASANO, 2008 The origin recognition complex is dispensable for
endoreplication in Drosophila. Proc Natl Acad Sci U S A 105: 12343-12348.
PAULI, A., F. ALTHOFF, R. A. OLIVEIRA, S. HEIDMANN, O. SCHULDINER et al., 2008
Cell-type-specific TEV protease cleavage reveals cohesin functions in Drosophila neurons. Dev
Cell 14: 239-251.
SCHWED, G., N. MAY, Y. PECHERSKY and B. R. CALVI, 2002 Drosophila minichromosome
maintenance 6 is required for chorion gene amplification and genomic replication. Mol
Biol Cell 13: 607-620.
SUN, J., and W. M. DENG, 2005 Notch-dependent downregulation of the homeodomain gene cut
is required for the mitotic cycle/endocycle switch and cell differentiation in Drosophila
follicle cells. Development 132: 4299-4308.
SUZUKI, D. T., T. GRIGLIATTI and R. WILLIAMSON, 1971 Temperature-sensitive mutations in
Drosophila melanogaster. VII. A mutation (para-ts) causing reversible adult paralysis.
Proc Natl Acad Sci U S A 68: 890-893.
UHLMANN, F., D. WERNIC, M. A. POUPART, E. V. KOONIN and K. NASMYTH, 2000 Cleavage of
cohesin by the CD clan protease separin triggers anaphase in yeast. Cell 103: 375-386.
VENKEN, K. J., J. W. CARLSON, K. L. SCHULZE, H. PAN, Y. HE et al., 2009 Versatile P[acman]
BAC libraries for transgenesis studies in Drosophila melanogaster. Nat Methods 6: 431-434.
Figure A1-1
60 kb BAC containing the dup genomic region
[Genome browser view of BAC CH321-89D23; legible gene labels: CG34365, Pms2, Mtk, CG30472, CG34188.]
Figure A1-2
Experimental Strategy Flow Chart
(1) Move gene region into pBluescript
(2) Introduce two unique six-base-pair restriction sites to target sequence by site-directed mutagenesis (SphI---NheI)
(3) Clone three tandem TEV protease cleavage site sequences (generated by custom DNA synthesis) into the target sequence (SphI-3xTEVseq-NheI)
(4) Move gene region with TEV protease cleavage sites into integrating plasmid
[Diagram labels: ChrV, ChrXI]
(5) Transform integrating plasmid into swapper strain (Orc1, Mcm6) or temperature-sensitive strain (Cdt1)
(6) Test for loss of covering plasmid (Orc1, Mcm6; growth on 5-FOA?) or growth at restrictive temperature (Cdt1)
(7) Test for reduced viability upon expression of TEV protease
Table A1-1. Conditional replication mutant candidates.

ORC1 Conditional Mutant Candidates
Rank  Target  Amino Acid Region  Notes
1     E290    EDEEE-DEDEE        342 aa N-terminal deletion viable
2     L30     GGQKRL-RRRGA       342 aa N-terminal deletion viable
2     E250    ITDNE-DGNE         342 aa N-terminal deletion viable
2     D760    KAKDD-NDDDD        Completed to Step 7. Equivalent growth with TEV protease expression
3     T413    LKTT-QKHQ          Completed to Step 2
3     N644    KGLN-DSFF          Completed to Step 2
3     S700    ASVS-GDAR          Completed to Step 2
3     E747    YDDE-DKDL          Completed to Step 2

MCM6 Conditional Mutant Candidates
Rank  Target  Amino Acid Region  Notes
1     K10     LNHVK-KVDDV        N-terminal deletion viable
1     S469    NIGAS-SPDAN        Completed to Step 3
2     A365    IQENA-NEIPT        Completed to Step 3
2     G686    ANPVG-GRYNR        Completed to Step 3

CDT1 Conditional Mutant Candidates
Rank  Target  Amino Acid Region  Notes
1     K450    KVTQK-SSNAN        Completed to Step 7. Reduced viability with TEV protease expression
2     S70     PDTS-QGFD          Completed to Step 7. Equivalent growth with TEV protease expression
Table A1-2. Yeast Strains Used in this Study.

Strain                        Description                       Relevant Genotype
9127 (UHLMANN et al. 2000)    Galactose-inducible TEV protease  MATα, SCC1-HA3::HIS3, GAL-NLS-myc9-TEV-NLS2 x 10::TRP1
AIAy19 (SPB)                  Orc1 swapper                      MATa, ade2-1, ura3-1, his3-11,15, trp1-1, leu2-3,112, can1-100, orc1::hisG, pSPB16 ([...]1, URA3)
ASY2157 (SPB)                 Mcm6 swapper                      MATa, ade2-1, ura3-1, his3-11,15, leu2-3, can1-100, trp1-[...]
E1541 (JACOBSON et al. 2001)  Cdt1 ts-allele                    MATa, ade2-1, trp1-1, can1-100, leu2-3,112, his3-11,15, ura3, GAL, psi+, sid2-21 (cdt1-ts)
YWL8                                                            MATa, orc1 D760 TEV::LEU2, orc1::hisG
YWL9                                                            cdt1-ts, GAL-NLS-myc9-TEV-NLS2 x 10::TRP1
YWLJ                                                            GAL-NLS-myc9-TEV-NLS2 x 10::TRP1
YWLc2                                                           MATa, orc1 D760 TEV::LEU2, orc1::hisG, GAL-NLS-myc9-TEV-NLS2 x 10::TRP1
YWL94                                                           MATa, cdt1-ts, GAL-NLS-myc9-TEV-NLS2 x 10::TRP1, cdt1 K450 TEV::LEU2
REFERENCES
JACOBSON, M. D., C. X. MUNOZ, K. S. KNOX, B. E. WILLIAMS, L. L. LU et al., 2001 Mutations in
SID2, a novel gene in Saccharomyces cerevisiae, cause synthetic lethality with sic1
deletion and may cause a defect during S phase. Genetics 159: 17-33.
UHLMANN, F., D. WERNIC, M. A. POUPART, E. V. KOONIN and K. NASMYTH, 2000 Cleavage of
cohesin by the CD clan protease separin triggers anaphase in yeast. Cell 103: 375-386.
Appendix Two:
Synteny analysis of the DAFC-30B amplified region
Gene amplification in Drosophila is a powerful model for studying metazoan DNA
replication. In addition to the genetic and molecular tools available, 12 Drosophila species have
been sequenced, enabling comparative genomic analysis of functional DNA elements (CLARK et
al. 2007). Although primarily investigated in Drosophila melanogaster, follicle cell gene
amplification has been demonstrated in at least 14 Drosophila species as well as the
Mediterranean fruit fly Ceratitis capitata (CALVI et al. 2007; VLACHOU et al. 1997).
We analyzed gene synteny in the six amplified genomic regions. At DAFC-30B, we
observed a break in synteny between D. melanogaster and D. pseudoobscura that prompted
closer investigation of this genomic region (Figure A2-1A). Because the break occurred in the
most amplified region, we used quantitative real-time PCR to test whether amplification of
DAFC-30B is conserved in D. pseudoobscura. Additionally, we tested whether the region distal
to the break is amplified in D. pseudoobscura. We observed stage-specific amplification of
DAFC-30B in D. pseudoobscura (Figure A2-1B). However, we did not observe amplification of
genes distal to the break (Figure A2-1C). These experiments reveal that the CG17855 homolog
(GA14701) is the left-most boundary of the sequence required for amplification. Furthermore,
analysis of syntenic breaks may be a useful method to define the cis requirements for
amplification in different Drosophila species.
ACKNOWLEDGEMENTS
We thank Matt Rasmussen and Manolis Kellis (MIT) for assistance with visualizing synteny in
Drosophila species, which led to the observations reported here.
REFERENCES
CALVI, B. R., B. A. BYRNES and A. J. KOLPAKAS, 2007 Conservation of epigenetic regulation,
ORC binding and developmental timing of DNA replication origins in the genus
Drosophila. Genetics 177: 1291-1301.
CLARK, A. G., M. B. EISEN, D. R. SMITH, C. M. BERGMAN, B. OLIVER et al., 2007 Evolution of
genes and genomes on the Drosophila phylogeny. Nature 450: 203-218.
VLACHOU, D., M. KONSOLAKI, P. P. TOLIAS, F. C. KAFATOS and K. KOMITOPOULOU, 1997 The
autosomal chorion locus of the medfly Ceratitis capitata. I. Conserved synteny,
amplification and tissue specificity but sequence divergence and altered temporal
regulation. Genetics 147: 1829-1842.
Figure A2-1. Analysis of gene synteny and DAFC-30B amplification in D. pseudoobscura.
(A) In D. pseudoobscura, there is a break in synteny at DAFC-30B to the left of CG17855. The
region to the right of the break is designated Dp DAFC-30B, and the region to the left of the
break is Dp Distal Region. (B and C) qPCR quantification of genomic DNA from egg chambers
staged as in CALVI et al. (2007). DNA copy number is quantified relative to DpAct5c, which we assume
to be non-amplified. At Dp DAFC-30B, there is stage-specific gene amplification (B). The Dp
Distal Region is not amplified (C).
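One common way to express qPCR copy number relative to a non-amplified reference locus is the comparative Ct calculation sketched below. The legend does not state the exact calculation used here, so this is an illustration only. It assumes that a non-amplifying calibrator DNA sample is available and that primer efficiency is close to two-fold per cycle; the function and parameter names are hypothetical.

```python
def fold_amplification(ct_target_sample: float,
                       ct_ref_sample: float,
                       ct_target_calibrator: float,
                       ct_ref_calibrator: float,
                       efficiency: float = 2.0) -> float:
    """Relative copy number of a target locus by the comparative Ct method.

    The reference locus (for example, DpAct5c) is assumed to be non-amplified.
    The calibrator is assumed to be DNA in which the target is not amplified.
    `efficiency` is the per-cycle amplification factor (2.0 means perfect doubling).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_sample - delta_calibrator
    return efficiency ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold two cycles earlier
    # than the reference in the staged sample, but not in the calibrator.
    print(fold_amplification(20.0, 22.0, 22.0, 22.0))
```

With these hypothetical values the target is present at four times the copy number of the reference locus, because each cycle of difference corresponds to a two-fold difference in template.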
[Figure A2-1. (A) D. melanogaster aCGH and gene tracks (scale bar, 50 kb) aligned with D. pseudoobscura synteny. Labeled D. pseudoobscura genes: GA10336 (CG10473), GA25312 (CG33298), GA14701 (CG17855), GA17702 (Oatp30B), GA12056 (CG13114). Marked regions: Dp DAFC-30B and Dp Distal Region. (B) Gene Amplification in Dp DAFC-30B. (C) Gene Amplification in Dp Distal Region.]
Appendix Three:
Summary of follicle cell amplicons
Figure A3-1

Amplicon  Expressed genes         Maximum Amplification  Identification Reference
DAFC-7F   Cp36, Cp38...           15-20 fold             Spradling 1981
DAFC-22B  CG7337                  4 fold                 This Thesis
DAFC-30B  CG13113, CG13114...     4 fold                 Claycomb et al. 2004
DAFC-34B  Vm34Ca, CG16956...      6-8 fold               This Thesis
DAFC-62D  yellow-g, yellow-g2...  4 fold                 Claycomb et al. 2004
DAFC-66D  Cp15, Cp18...           60-80 fold             Spradling 1981

[Timeline panel. Amplicon labels: DAFC-22B, DAFC-34B, DAFC-62D. Stage labels: st10B initiation (three times), st11 initiation, st11 elongation (twice), st12-13 elongation, st12 final initiation, st12 elongation, st13 elongation, st13 final initiation.]
[Figure A3-2. Panels: DAFC-30B, DAFC-22B, DAFC-34B, DAFC-62D, DAFC-66D, DAFC-7F.]