RESEARCH ARTICLE
© 2003 Nature Publishing Group http://www.nature.com/naturebiotechnology
Scanning the human genome with combinatorial
transcription factor libraries
Pilar Blancafort, Laurent Magnenat, and Carlos F. Barbas III∗
Published online 18 February 2003; doi:10.1038/nbt794
Despite the critical importance of transcription factors in mediating gene regulation, there exists no general,
genome-wide tool that uses transcription factors to induce or silence a target gene or select for a particular
phenotype. In the strategy described here, we prepared large combinatorial libraries of artificial transcription
factors comprising three or six zinc-finger domains, and selected transcription factor–DNA interactions able
to upregulate several genes in human cells. Selected transcription factors either induced the expression of
an endothelial-specific differentiation marker, VE-cadherin, in non-endothelial cell lines or, when combined
with a repression domain, knocked down expression. Potential binding sites for a number of these transcription factors were mapped along the promoter of CDH5, the gene encoding VE-cadherin. Transcription factor
libraries represent a useful approach for studying and modulating gene function in cells and potentially in
whole organisms.
Regulatory sequences and their attendant transcription factors provide
the spatial-temporal cues that direct when, where, and to what extent a
given gene is expressed. Most regulatory sequences contain binding
sites for repertoires of transcription factors that mediate activation or
repression of target genes. Considerable efforts have been devoted to
engineering artificial sequence-specific transcription factors able to
regulate specific genes, particularly therapeutic targets1. Compared
with other approaches to studying gene function, such as RNA interference, ribozymes, or antisense RNA, that provide solely knock-down
phenotypes2,3, transcription factor–based tools can generate both loss-of-function phenotypes (when the transcription factor is linked to a
repressor domain) and gain-of-function phenotypes (through linkage
to an activator domain). Nevertheless, no general genome-wide transcription-factor tools have been described4.
Current transcription factor–based strategies involve the individualized design and testing of transcription factors targeted to particular genes. Modular zinc-finger DNA-recognition domains allow the
assembly of transcription factors with predictable in vitro specificity.
Such ‘de novo’ design has been used successfully for the regulation of
a small number of genes (including ERBB2, ERBB3, VEGF, and
EPO5–8). However, rational design has not always yielded functional
regulators in vivo, mainly because knowledge of both regulatory
areas and of endogenous factors affecting transcription factor–DNA
interactions (such as chromatin structure, accessibility of the regulatory area, DNA modifications, and the presence of other cellular or
tissue-specific factors) is often very limited6,7. In the combinatorial
strategy described here, large libraries of artificial transcription factors were created and used to select in vivo protein-DNA interactions
that confer a desired phenotype or molecular function to human
cells through the activation of one or more genomic loci.
Results and discussion
Construction of zinc-finger libraries. We created libraries of zinc-finger transcription factors (TFZFs) for the recognition of DNA
target sites of 9 and 18 base pairs (bp). Zinc-finger domains have
exquisite sequence specificity and modularity1. Previous studies have
identified α-helical sequences in the zinc-finger domain that confer
specific recognition of 3 bp of DNA sequence, and have shown that
these domains can be recombined to prepare polydactyl zinc-finger
proteins of desired specificity5–13. Use of characterized zinc-finger
domains allowed the prediction of potential DNA binding sites for
each TFZF after the functional screen or selection was done.
We created the 3ZF library by combinatorial assembly of three different zinc-finger repertoires (ZF1, ZF2, and ZF3). Each repertoire
consisted of an equimolar mixture of a subset of defined zinc-finger
DNA sequences encoding a characteristic α-helical element previously optimized to provide specific recognition of 3 bp of DNA
(Fig. 1A). Combination of a variety of available specific zinc-finger
recognition helices for ZF1, ZF2, and ZF3 (consisting of all the
helices recognizing DNA triplets of type GNN and a subset of the
ANN and TNN triplets8,10,12) allowed the preparation of a 9,177
member, 9 bp–targeting 3ZF library. The 3ZF library was then used
as a template to assemble the 18 bp–targeting 6ZF library (8.4 × 10⁷
members). TFZFs were linked to a potent transcriptional activation
domain (VP64)10. We expected that the 3ZF library would recognize
a subset of genomic DNA sequences of type 5′-(NNN)₃-3′, whereas
the 6ZF library would recognize a subset of genomic sequences of
type 5′-(NNN)₆-3′. Given the zinc-finger domains used, both
libraries were more likely to recognize (RNN)ₓ-type sequences
(R = G or A). In theory, the human genome contains 750 million
(RNN)₃ sites (considering both strands) and 93.75 million (RNN)₆
sites14. Although any (RNN)₃-binding TFZF might be expected to
bind many sites in the genome, in the living cell many of these binding sites would be inaccessible or in regions with no impact on regulation. (RNN)₆-binding TFZFs can bind unique sites in the genome.
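The library sizes and site counts quoted above can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check, not the authors' calculation: it takes the 23 ZF1, 21 ZF2, and 19 ZF3 helices listed in the Experimental protocol, and it assumes a haploid genome of 3 × 10⁹ bp with equal, independent base frequencies (both assumptions are ours, as are the function names).

```python
# Back-of-the-envelope check of the library sizes and (RNN)x site counts.
# Assumptions (not stated in the article): haploid genome of 3e9 bp,
# equal and independent base frequencies, so P(purine) = 0.5.

ZF1, ZF2, ZF3 = 23, 21, 19       # helices per finger position (Experimental protocol)
GENOME_BP = 3_000_000_000        # assumed haploid genome length
P_PURINE = 0.5                   # assumed probability that a base is G or A


def library_size_3zf() -> int:
    """Number of distinct three-finger combinations."""
    return ZF1 * ZF2 * ZF3


def library_size_6zf() -> int:
    """Six-finger proteins are built by joining two 3ZF library members."""
    return library_size_3zf() ** 2


def expected_rnn_sites(n_triplets: int, strands: int = 2) -> float:
    """Expected number of (RNN)n start positions: one purine is required
    at the first position of each of the n triplets."""
    return GENOME_BP * strands * P_PURINE ** n_triplets


if __name__ == "__main__":
    print("3ZF library members:", library_size_3zf())
    print("6ZF library members:", library_size_6zf())
    print("(RNN)3 sites:", expected_rnn_sites(3))
    print("(RNN)6 sites:", expected_rnn_sites(6))
```

Under these assumptions the numbers agree with the text: 9,177 members for the 3ZF library, about 8.4 × 10⁷ for the 6ZF library, and 750 million and 93.75 million expected sites.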
Screening for upregulation of target genes in human cell lines. We
delivered millions of TFZFs into the human squamous carcinoma cell
line A431 using a retroviral vector, pMX-IRES-GFP15 (Fig. 1B).
Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037. ∗Corresponding author (carlos@scripps.edu).
www.nature.com/naturebiotechnology • MARCH 2003 • VOLUME 21 • nature biotechnology 269
Figure 1. The TFZF library design. (A) The TFZF library construction based on the modular
organization of protein-DNA contacts. (B) Screening for functional TFZF activators in A431 cells.
(C–F) Flow cytometric analysis of A431 cells infected with some of the selected pMX-TFZF pools
from the 3ZF selections (C, D) or 6ZF selections (E, F). Shown are upregulation of ERBB-2 (C),
VE-cadherin (D, E), and ICAM-1 (F). Blue, A431 cells infected with the selected pMX-TFZF pools and
stained with the corresponding antibody; orange, A431 cells infected with the 3ZF or 6ZF unselected
libraries; green, mock-infected cells; stippled line, control staining without primary antibody.
Infection efficiency and expression of individual library members
were tracked with the green fluorescent protein (GFP) marker. Cells
overexpressing a target gene product on the cell surface were selected
by flow cytometry. The DNA encoding the zinc-finger domain was
recovered by PCR and re-cloned into the retroviral vector for subsequent
rounds of selection. Finally, individual TFZF clones were isolated
and sequenced and their functional properties were analyzed
in vivo and in vitro. A431 cells infected with 3ZF and 6ZF libraries
were screened with monoclonal antibodies against ten different
markers: vascular endothelial cadherin (also known as VE-cadherin;
cadherin-5 type 2, CDH5; CD144); 3-FAL selectin ligand
(fucosyltransferase-4, FUT4; CD15); Apo1-FAS antigen (tumor necrosis
factor superfamily member 6, TNFRSF6; CD95); integrin-α6 (ITGA6;
CD49f) and integrin-β4 (ITGB4; CD104); the
adhesion molecules CD54 (intracellular
adhesion molecule, ICAM-1) and leukocyte
function-associated antigen (LFA-3; CD58);
and the receptors erythroblastic leukemia
viral oncogene homolog-2 (ERBB2), ERBB-3,
and epidermal growth factor (EGF).
Independent selections were carried out for
each marker (Fig. 1B). These markers were
chosen because they localize on the cell surface, facilitating cell sorting, and because they
are involved in important aspects of tumor
biology such as cell proliferation, adhesion, or
migration.
After three rounds of selection with the
3ZF library and four rounds with the 6ZF
library, pools of infected A431 cells were analyzed by flow cytometry. For both libraries,
five cell surface markers showed changes in
expression levels (Fig. 1 and Supplementary
Fig. 1 online): ERBB-2 and VE-cadherin were
the most highly regulated by the 3ZF library
(Fig. 1C,D), and VE-cadherin and ICAM-1
were the most highly regulated by the 6ZF
library (Fig. 1E,F). The remaining three
markers showed only small changes in gene
expression with both 3ZF and 6ZF selections.
Both the 3ZF (Fig. 1D) and 6ZF (Fig. 1E)
libraries induced expression of the strictly
endothelial-specific marker VE-cadherin. This marker is not significantly
expressed in A431 cells, as determined by FACS and
RT-PCR. VE-cadherin is a transmembrane glycoprotein that self-associates
in the adherens junctions of endothelial cells, controlling
the permeability of the endothelium16. In addition, VE-cadherin is
necessary for vascular morphogenesis17 and is involved in several
aspects of angiogenesis18, tumor growth, and metastasis19,20. We
focused further studies on the characterization of TFZFs activating
its associated gene, CDH5.
In vitro and in vivo analysis of TFZFs regulating CDH5. The
sequences of the TFZFs regulating CDH5 and their predicted
Table 1. 6ZF (top) and 3ZF (bottom) clones activating VE-cadherin
         ZF helices (a)                                          Target sites (b)
TFZF     F6       F5       F4       F3       F2       F1        Half-site 1 – Half-site 2          Fold act. (c)   Kd (nM) (d)
144-3    QSSSLVR  TSGHLVR  RSDHLTT  QSSNLVR  QLAHLRA  QSSHLVR   5′-GTA GGT TGG – GAA AGA GGA-3′    8×              n.d.
144-4    QAGHLAS  RSDDLVR  TSGELVR  QAGHLAS  RSDKLVR  DPGALVR   5′-TGA GCG GCT – TGA GGG GTC-3′    10×             23
144-5    TSGELVR  QLAHLRA  QSGDLRR  TSGSLVR  DPGNLVR  QSSNLAS   5′-GCT AGA GCA – GTT GAC TAA-3′    20×             n.d.
144-13   QSGDLRR  DPGNLVR  TSGHLVR  REDNLHT  DPGNLVR  DCRDLAR   5′-GCA GAC GGT – TAG GAC GCC-3′    80×             74
144-23   DPGALVR  QLAHLRA  QSSHLVR  QSSSLVR  DPGHLVR  QSSSLVR   5′-GTC AGA GGA – GTA GGC GTA-3′    79×             n.d.
144-29   QAGHLAS  QAGHLAS  TSGHLVR  QSSSLVR  DPGHLVR  QRANLRA   5′-TGA TGA GGT – GTA GGC AAA-3′    27×             n.d.
VE-1                                REDNLHT  RSDKLVR  QSSNLVR   5′-TAG GGG GAA-3′                  80×             95
VE-5                                RSDKLVR  TSGNLVR  QRANLRA   5′-GGG GAT AAA-3′                  4×              n.d.
VE-8                                RSDKLVR  QSSNLVR  QRANLRA   5′-GGG GAA AAA-3′                  30×             1,009
VE-13                               TSGSLVR  QSSNLVR  RSDNLVR   5′-GTT GAA GAG-3′                  5×              n.d.
VE-18                               TSGHLVR  QAGHLAS  RSDDLVR   5′-GGT TGA GCG-3′                  7×              n.d.
The DNA interacting helices are presented with the predicted 18 bp or 9 bp target site. The fold activation of the endogenous VE-cadherin gene is shown.
(a) ZF helices are positioned in anti-parallel orientation (COOH-F6 to F1-NH2) relative to the DNA target sequence. Amino acid position –1 to +6 of each DNA recognition helix is shown. 144 clones are 6ZF proteins; VE clones are 3ZF proteins.
(b) Predicted target DNA sequences are presented in the 5′→3′ orientation.
(c) Fold change of expression from FACS data is determined relative to the primary unselected library (3ZF or 6ZF library).
(d) Dissociation constant (Kd) determined by gel shift assay. Data represents the average of two to four experiments.
Figure 2. Specificity of isolated TFZF clones in vivo and in vitro. (A) A431
cells were infected with different pMX-TFZF (containing the VP64
activator domain), stained with ten different antibodies, and analyzed by
flow cytometry. Blue, A431 cells infected with pMX-TFZF VE-1 (a single
clone selected for VE-cadherin activation); orange, A431 cells infected
with the 3ZF unselected library; green, mock-infected cells; stippled line,
control staining without primary antibody. Genes encode: CD58,
leukocyte function-associated antigen; CDH5, VE-cadherin (CD144);
EGF, epidermal growth factor; FUT4, 3-FAL selectin ligand (CD15);
ICAM1, intracellular adhesion molecule (CD54); ERBB2, ERBB3,
erythroblastic leukemia viral oncogene homolog-2 and -3; ITGA6,
integrin-α6 (CD49f); ITGB4, integrin-β4 (CD104); TNFRSF6, Apo1-FAS
antigen (CD95). (B, C) DNA-binding ELISA of the selected –6ZF (B)
and –3ZF protein domains (C) expressed as fusions with MBP. All TFZFs
were selected for VE-cadherin upregulation except 54.3, which was
selected for ICAM-1 activation. The DNA substrates contained the 18
bp or 9 bp predicted binding site for each 6ZF or 3ZF protein,
respectively (Table 1).
binding sites are presented in Table 1. From a total of 48 3ZF
clones and 36 6ZF clones tested, a number of sequences were identical at the nucleotide level, indicating selective pressure for particular clones from the libraries. Some TFZFs were able to induce
strong CDH5 expression—for example, the 6ZF clone 144-13 and
the 3ZF clones VE-1 and VE-8. To test the specificity of these TFZFs
for CDH5, we delivered TFZFs into A431 cells and probed them
with antibodies specific for ten different cell surface markers. The
TFZF clones VE-1, VE-5, VE-8, VE-13, 144-4, 144-5, and 144-13
preferentially activated CDH5 compared with the other genes tested (Fig. 2 and Supplementary Fig. 2 online). The 3ZF clone VE-1
was the most specific TFZF regulator in vivo, as determined by
FACS (Fig. 2A). 3ZF proteins may be capable of binding multiple
sites in the human genome and activating, to varying degrees,
more than one gene. Depending on the application, this could be a
limitation or an advantage.
To verify that the selected TFZFs bound their predicted DNA
substrates in vitro, we expressed the zinc-finger binding domains
as C-terminal fusions with bacterial maltose-binding protein
(MBP). The DNA-binding specificity of each fusion protein was
tested by ELISA using a panel of DNA substrates (Fig. 2B,C). The
predicted DNA binding site of each TFZF was decoded from the
α-helical sequence of the corresponding zinc finger (Table 1). As
expected, the majority of the TFZFs specifically bound their predicted target site in vitro. Notably, some of the α-helices selected in
TFZFs VE-1, VE-5, and VE-8 were identical or very similar (Table
1), explaining their similar binding-site preferences (Fig. 2C).
TFZFs VE-1 and VE-8 shared two identical α-helices that interact
with the subsequence 5′-GGGGAA-3′, resulting in recognition of
the VE-1 predicted target site by both VE-1 and VE-8. The
binding-site preferences of these proteins, and in particular the
strong recognition of both VE-1 and VE-8 for the same target site,
raise the possibility that these TFZFs have been selected to bind
partially overlapping genomic sites.
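The decoding step described above, in which each recognition helix is translated into the DNA triplet it is expected to bind, can be sketched as a simple lookup. The helix-to-triplet pairs below are read off Table 1 by aligning each helix with its position in the predicted target site; the dictionary and function names are ours, and the sketch is an illustration rather than the authors' software.

```python
# Illustrative decoding of predicted target sites from zinc-finger helices.
# Helix/triplet pairs are taken from Table 1 (helix positions -1 to +6,
# triplets written 5'->3'); names used here are hypothetical.

HELIX_TO_TRIPLET = {
    "QSSSLVR": "GTA", "QAGHLAS": "TGA", "TSGELVR": "GCT", "QSGDLRR": "GCA",
    "DPGALVR": "GTC", "TSGHLVR": "GGT", "RSDDLVR": "GCG", "QLAHLRA": "AGA",
    "DPGNLVR": "GAC", "RSDHLTT": "TGG", "QSSHLVR": "GGA", "QSSNLVR": "GAA",
    "TSGSLVR": "GTT", "REDNLHT": "TAG", "RSDKLVR": "GGG", "DPGHLVR": "GGC",
    "TSGNLVR": "GAT", "QSSNLAS": "TAA", "DCRDLAR": "GCC", "QRANLRA": "AAA",
    "RSDNLVR": "GAG",
}


def predicted_site(helices):
    """Return the predicted 5'->3' site for helices listed from the most
    C-terminal finger (F3 or F6) down to F1. Fingers bind anti-parallel to
    the DNA, so the C-terminal finger reads the 5'-most triplet."""
    return "".join(HELIX_TO_TRIPLET[h] for h in helices)


VE_1 = ["REDNLHT", "RSDKLVR", "QSSNLVR"]   # F3, F2, F1
VE_8 = ["RSDKLVR", "QSSNLVR", "QRANLRA"]   # F3, F2, F1

if __name__ == "__main__":
    print("VE-1:", predicted_site(VE_1))
    print("VE-8:", predicted_site(VE_8))
    shared = [h for h in VE_1 if h in VE_8]
    print("shared helices read:", predicted_site(shared))
```

With these assignments the two helices common to VE-1 and VE-8 (RSDKLVR and QSSNLVR) decode to 5′-GGGGAA-3′, the shared subsequence mentioned in the text.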
To verify that the selected TFZFs were able to regulate CDH5 at
the level of transcription, we analyzed CDH5 mRNA levels of A431
cells infected with clones 144-4, 144-13, and VE-1 by RT-PCR. As a
positive control we used human umbilical vein endothelial cells
(HUVEC) expressing CDH5. Specific CDH5 product was detected
in A431 cells infected with the TFZF constructs, and these clones
were able to upregulate the expression of CDH5 at the level of transcription (Fig. 3A,B).
Next, we investigated whether or not the TFZFs were able to
directly activate the proximal human CDH5 promoter. In mice, a
promoter fragment (–2486 to +24) is sufficient to drive endothelial-specific expression of a reporter gene in transgenic animals21.
We cloned a homologous region of the human CDH5 promoter
upstream of a luciferase reporter and carried out transactivation
studies using TFZFs in transient transfection assays of A431 cells
(Fig. 4). Only TFZFs VE-1, VE-5, and VE-8 strongly activated the
CDH5 promoter (up to 200-fold; Fig. 4A and Supplementary Fig. 3
online). We mapped the VE-1, VE-5, and VE-8 response elements in
the CDH5 promoter using serial deletions of the promoter.
Important transactivation determinants of VE-1 were located
between positions –2369 and –1861, whereas VE-5 and VE-8
responded significantly to elements located between nucleotides
–1861 and –1342 (Fig. 4A). In addition, both VE-1 and VE-8 (but
not VE-5) activated the proximal (–403 to +80) fragment of the
CDH5 promoter 10–15-fold.
Promoter regions associated with luciferase activation correlated
with TFZF binding in vitro (Fig. 4B). We localized a putative VE-1 and
VE-8 binding site between positions –88 and +80 of the proximal
Figure 3. Semiquantitative RT-PCR analysis of A431 cells infected with
several pMX-TFZF selected for CDH5 activation. (A) RT-PCR analysis of
CDH5 expression in these infected cells (clones 144-4, 144-13, and
VE-1). HUVEC cells, which express CDH5, were used as a positive
control. A431, mock-infected cells; –, control experiment in absence of
cDNA. (B) Relative CDH5 mRNA levels were normalized to TFZF
expression using VP64-specific primers. Equal loading was controlled
using GAPDH-specific primers.
Figure 4. Interactions of TFZFs VE-1, VE-5, and VE-8 with the CDH5 promoter. (A) Luciferase transactivation
assay of VE-1, VE-5, and VE-8 with several 5′ deletions of the CDH5 promoter in A431 cells. (B) DNA-binding
ELISA of several promoter fragments with the TFZFs VE-1, VE-5, and VE-8 purified as a fusion with MBP.
Promoter fragments (boxes) were amplified by PCR using 5′-biotinylated primers. The binding of each
fragment was normalized and expressed as percentage of the highest value. Binding data was represented
in a color gradient (higher binding corresponds to darker boxes). (C) DNA-binding ELISA of VE-1, VE-5, and
VE-8 proteins with the DNA duplex pr–88 and with the mutant pr–88(G₄→T₄). (D) Luciferase transactivation
assay of VE-1, VE-5, and VE-8 with the proximal –88 CDH5 promoter fragment and the same fragment
containing a point mutation (G₄→T₄). (E) Summary of putative interactions between VE-1 and CDH5 promoter
fragments. Open boxes, potential binding sites for VE-1 as determined in vitro; underlining, putative EBS. The
sequence of the –88 bp proximal human CDH5 promoter and the point mutation G→T introduced for
transactivation studies are indicated. (F) Interaction of VE-1 with several potential binding sites located in the
CDH5 promoter. The Kd (±s.d.) of VE-1 with its predicted DNA substrate (VE-1 subs) was determined by gel
shift assay. Kd values for VE-1 with promoter DNA duplexes containing potential VE-1 binding sites (comprising
the 9 bp putative interacting sequence and three flanking base pairs) were determined by ELISA and
normalized to VE-1 subs. The positions of the potential binding sites relative to the transcription start site are
indicated. Nucleotides that differ from the theoretical VE-1 binding site (VE-1 subs) are indicated in red.
(G) DNA sequences selected in vitro from a randomized DNA library (N₁₀) for its interaction with VE-1 by
CAST assay. The number of sequences containing identical VE-1 binding site is indicated. Open box,
invariable nucleotides (consensus).
CDH5 promoter (the pr–88 duplex, 5′-CAGG₄GGGAA-3′) that
matched 8 of 9 bp of the predicted VE-1 binding site (Fig. 4E).
Indeed, both VE-1 and VE-8 interacted specifically with this duplex
in vitro (Fig. 4C). A single mutation in this duplex (G₄→T₄) completely disrupted its interactions with VE-1 and VE-8. A promoter
fragment (–88 to +80) containing this sequence retained VE-1- and
VE-8-mediated transactivation, whereas the fragment bearing the
point mutation was unresponsive to the transcription factors (Fig.
4D). The sequence 5′-GGAA-3′ (ETS-binding site-2, EBS2) is conserved
between mouse and the human promoters, and in the mouse
it interacts with the ETS-1 protein, a transcription factor of the ETS
family expressed in endothelial cells during blood vessel formation22–24.
In the mouse proximal promoter, Ets-1 binds to two neighboring GGAA sites (EBS2 and EBS4) and activates CDH5 in
endothelial cells23. Our experiments
showed that VE-1 and VE-8 are able
to activate the proximal (–88 to +80)
human CDH5 promoter fragment
10–15-fold through interaction with
a single EBS, but activation of the
reporter was enhanced up to 200-fold through interaction with distal
sequences located between positions
–2369 and –1342. In mice, the proximal CDH5 promoter (–139 to +24) is
responsible for ubiquitous transcription, but upstream sequences are
necessary to silence the activity of the
basic promoter in non-endothelial
cells21. It is possible that the TFZFs
could interfere with the silencing of
the CDH5 promoter between positions –2369 and –1342, resulting
in an enhanced transactivation.
Examination of the CDH5 promoter
showed several potential VE-1, VE-5,
and VE-8 binding sites in that region.
Next, we focused on TFZF VE-1 to
study zinc finger–binding determinants along this distal promoter
area. To determine the binding-site
preferences of VE-1, we carried out
in vitro DNA selection experiments
(cyclic amplification of selected targets (CAST) assay) using a randomized 10 bp DNA library and purified
VE-1 protein. After four rounds
of DNA selection, all the analyzed
selected targets contained a 7 bp
invariable consensus core, 5′-AGGGGGA-3′ (Fig. 4G).
Positions 1 and 9 flanking this core
tolerated nucleotide variations.
Indeed, nucleotide 1 is the partner
of Thr+6, located in the α-helical
region of VE-1 ZF3. As in the case of
Zif268 (ref. 25), Thr+6 is not
expected to make specific hydrogen
bonds and therefore could not
unambiguously discriminate its target
nucleotide. Nucleotide 9 is a target
of Gln–1 in the α-helix of ZF1.
Although Gln–1 in this particular
zinc finger prefers A at position
3′ of the triplet, it can also tolerate T, C, or G, as reported for the
same GAA-binding zinc finger of a Zif268 variant12. Figure 4F
shows an alignment of the potential VE-1 binding sites found in
the distal CDH5 promoter between positions –2369 and –1342. In
vitro binding data showed that three DNA sequences in this region
interacted with VE-1 with an affinity similar to those of the predicted
VE-1 substrate and the –88 duplex (duplexes –2303, –1990,
and –1591). In agreement with the CAST data, these duplexes have
an identical core but different nucleotides at positions 1 and 9. As
expected, mutations in the conserved core all decreased the affinity
of VE-1 for its target DNA duplex. Overall, these data suggest
that a possible mechanism of activation by TFZF VE-1 involves
direct regulation of the promoter by interaction with multiple
binding sites in both the proximal and distal regions.
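The CAST result suggests a simple way to look for candidate VE-1 sites in a promoter sequence: require the invariant 7 bp core and allow any nucleotide at positions 1 and 9. The sketch below illustrates that idea on the sequences quoted in the text; it scans the given strand only, the function names are ours, and it is not the procedure the authors used to locate sites.

```python
import re

# Candidate VE-1 sites: invariant core flanked by one variable base on each
# side (positions 1 and 9), as suggested by the CAST consensus.
CAST_CORE = "AGGGGGA"
# A lookahead is used so that overlapping candidate sites are all reported.
SITE_PATTERN = re.compile(rf"(?=([ACGT]{CAST_CORE}[ACGT]))")


def find_candidate_sites(seq):
    """Return (0-based start, 9 bp site) for every match on the given strand."""
    seq = seq.upper()
    return [(m.start(1), m.group(1)) for m in SITE_PATTERN.finditer(seq)]


def count_matches(a, b):
    """Number of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def point_mutation(seq, position, new_base):
    """Replace the base at a 1-based position."""
    return seq[: position - 1] + new_base + seq[position:]


VE1_PREDICTED = "TAGGGGGAA"   # predicted VE-1 site (Table 1)
PR_88 = "CAGGGGGAA"           # pr-88 duplex from the proximal promoter

if __name__ == "__main__":
    print(count_matches(PR_88, VE1_PREDICTED), "of 9 bp match")
    print(find_candidate_sites(PR_88))
    mutant = point_mutation(PR_88, 4, "T")   # the G4->T4 mutant
    print(mutant, find_candidate_sites(mutant))
```

On these inputs the pr–88 duplex matches 8 of 9 bp of the predicted site and contains the core, whereas the G₄→T₄ mutant no longer does, which is consistent with the loss of binding reported in Fig. 4C.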
Figure 5. Regulation of CDH5 by TFZFs in several human cancer cell lines.
Blue, cells infected with a pMX construct containing the DNA binding domain
of VE-1 (A–E) or VE-8 (F) and the VP64 activator domain, stained with anti-CD144 and analyzed by FACS. Red, cells infected with the pMX vector
containing the same DNA-binding domain but linked to the KRAB repression
domain (SKD), and stained with anti-CD144 (anti-VE-cadherin). Green, level
of VE-cadherin expressed on mock-infected cells; stippled lines, cells stained
in the absence of primary antibody.
Many TFZFs activated CDH5 in cancer cell lines where the gene
product was not significantly expressed as determined by FACS, such
as A431 (squamous carcinoma), HeLa, MDA-MB-435s (breast cancer), and HT29 (colon cancer) cells (Fig. 5). Notably, some regulators
activated (when linked to a VP64 activator domain) or repressed
(when linked to the KRAB repression domain5) CDH5 expression in
cell lines where the gene is well expressed, such as in melanoma
C8161 (Fig. 5B) or SKBR-3 cells (Fig. 5C). In melanoma C8161 cells,
expression of CDH5 has been associated with the formation of
vascular-like networks in three-dimensional collagen gels26. The
selected TFZFs could be useful tools for studying the role of CDH5
with respect to several aspects of angiogenesis, tumor progression,
and metastasis by these different cancer cell lines.
Among all the TFZFs tested, the promoter-binding TFZFs were able
to regulate CDH5 in all cell lines tested, as expected for direct regulation of the promoter. Those TFZFs that did not transactivate the
promoter in the reporter assay (such as 144-4, 144-5, and 144-13)
showed different activation profiles that varied depending on the
cell line examined. Some of these TFZFs could bind regulatory
regions located in the large 5′ introns of CDH5, or even regulatory
regions of upstream genes, perhaps encoding tissue-specific factors
involved in controlling CDH5 expression. Candidates for these indirect targets include some members of the ETS family, including
ETS1, ERG, and FLI1 (refs. 23, 24). However, database searches
showed that at most 14 of 18 bp within these regions had identity to
predicted TFZF targets. A search for 6ZF-binding sites in the human
genome identified target sites matching between 13 and 18 bp
(see Supplementary Tables 1 and 2 online). Within the CDH5 locus,
13–14 bp matches were identified. Although further investigation is
required to understand their in vivo significance, these results suggest that 6ZF proteins could use a subset of the 18 bp sites to interact
with genomic sites.
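A search of the kind described above, for genomic windows that match an 18 bp predicted target at 13 or more positions, can be sketched as a sliding-window comparison on both strands. This is a minimal illustration on a toy sequence, not the database search the authors ran; the function names and the toy input are hypothetical, and a real genome-scale search would need an indexed or vectorized approach.

```python
# Sliding-window search for partial matches to a predicted 6ZF target site.
# Brute force, for illustration only; names and inputs are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def scan_partial_matches(sequence, target, min_matches=13):
    """Return (strand, start, matches) for every window with at least
    `min_matches` identical positions. For the '-' strand, `start` is the
    0-based index within the reverse-complemented sequence."""
    sequence, target = sequence.upper(), target.upper()
    n = len(target)
    hits = []
    for strand, s in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(len(s) - n + 1):
            matches = sum(a == b for a, b in zip(s[i:i + n], target))
            if matches >= min_matches:
                hits.append((strand, i, matches))
    return hits


TARGET_144_13 = "GCAGACGGTTAGGACGCC"   # predicted 18 bp site of clone 144-13 (Table 1)

if __name__ == "__main__":
    toy = "TTT" + TARGET_144_13 + "AAA"   # toy sequence, not genomic data
    for hit in scan_partial_matches(toy, TARGET_144_13):
        print(hit)
```

Lowering `min_matches` quickly increases the number of hits, which is one way to see why partial (13 to 14 bp) matches are expected to be far more common than perfect 18 bp matches.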
In summary, we present a method to identify functional DNA-protein interactions involved in the activation of target genes in
human cells by screening large combinatorial libraries of TFZFs. We
characterized clones selected from 3ZF and 6ZF libraries that were
able to induce an endothelial-specific marker, VE-cadherin, in a
non-endothelial cancer cell line, A431. A population of selected
TFZFs was able to directly transactivate the CDH5 promoter by
binding both a proximal and a distal promoter region. In addition,
we showed that these TFZFs could regulate their target gene in a
variety of human cancer cell lines. The advantages of libraries of
small TFZFs, such as 3ZF libraries, include high representation of
individual members and the possibility of binding multiple sites in
one or more regulatory regions, a mode of regulation analogous to
the action of natural transcription factors. Highly complex
libraries of the 6ZF type have low representation of each individual
TFZF clone but potentially higher specificity. These TFZFs could recognize low-frequency, potentially unique sites that are sufficient to
activate or repress the target gene. Used in combination with current technologies such as DNA microarrays and chromatin
immunoprecipitations, they could be useful for identifying genes
and defining pathways. Recent studies in transgenic tobacco and
Arabidopsis thaliana plants indicate that zinc-finger technology can
be applied to whole organisms27,28. Thus, this methodology represents a genetic tool for the selection or screening of gain-of-function and loss-of-function phenotypes at the level of the cell or
organism based on direct gene regulation or on more complex
changes in transcriptional programs.
Experimental protocol
Construction of TFZF libraries. The 3ZF library was created by overlapping
PCR using 23 different ZF1s, 21 ZF2s, and 19 ZF3s mixed into the PCR reaction (see Supplementary Experimental Protocol online). All DNAs used as
templates for PCR were SP1 variants containing specific zinc-finger α-helices
selected and characterized in our laboratory8,10,12. These templates were
cloned and sequenced in pMalc2 (New England Biolabs, Beverly, MA). The
final (F1 + F2 + F3) PCR product was digested with SfiI and cloned in the
pComb3X vector29. The resulting pComb3X-3ZF library vector was used to
construct the 6ZF library as follows. First, 10 µg of pComb3X-3ZF library
vector was digested with AgeI and NheI and ligated with 3 µg of XmaI- and
NheI-digested inserts to generate the pComb3X-6ZF library vector. Both 3ZF
and 6ZF library inserts were digested with SfiI and subcloned into the retroviral vector pMX-IRES-GFP, containing the VP64 activation domain5. The
final sizes of the 3ZF and 6ZF libraries in the retroviral vector were 3.52 × 10⁵
and 5.3 × 10⁷, respectively.
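The cloned library sizes can be compared with the theoretical diversity to estimate how well each library is represented. The sketch below reads the reported sizes as 3.52 × 10^5 and 5.3 × 10^7 clones, uses 23 × 21 × 19 combinations for the 3ZF library and its square for the 6ZF library, and assumes that every member is cloned with equal probability (a Poisson sampling model that is our simplification, not a claim from the article).

```python
import math

# Rough estimate of library representation in the retroviral vector.
# Assumption (ours): clones are drawn uniformly at random from the
# theoretical library, so coverage follows Poisson statistics.

THEORETICAL = {"3ZF": 23 * 21 * 19, "6ZF": (23 * 21 * 19) ** 2}
CLONED = {"3ZF": 3.52e5, "6ZF": 5.3e7}   # reported library sizes


def fold_coverage(library):
    """Average number of independent clones per theoretical member."""
    return CLONED[library] / THEORETICAL[library]


def expected_fraction_sampled(library):
    """Expected fraction of theoretical members present at least once."""
    return 1.0 - math.exp(-fold_coverage(library))


if __name__ == "__main__":
    for lib in ("3ZF", "6ZF"):
        print(lib,
              f"coverage = {fold_coverage(lib):.2f}x,",
              f"fraction sampled = {expected_fraction_sampled(lib):.3f}")
```

Under this model the 3ZF library is oversampled roughly 38-fold, whereas the 6ZF library covers less than one clone per theoretical member, in line with the high and low representation of individual members described for the two library types.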
Screening for functional TFZF activators in A431 cells and flow cytometry.
The pMX-IRES-GFP-3ZF library and pMX-IRES-GFP-6ZF library DNAs
were transfected into 293 packaging cells5 using Lipofectamine Plus
(Invitrogen, Carlsbad, CA) according to the manufacturer’s directions. The
product retroviral particles were used to infect 5 × 10⁵ (3ZF library) or 10⁸
(6ZF library) A431 cells. At 48 h after infection, these cells were stained with
ten different primary antibodies (5 µg/ml) specific for different cell surface
markers: anti-CD15 (clone 2F3; BD, PharMingen, San Diego, CA), anti-ERBB-2 (clone SP77; ref. 5), anti-ERBB-3 (clone SPG1, NeoMarkers,
Fremont, CA), anti-CD104 (clone 450–9D), anti-CD144 (clone 55–7H1,
PharMingen), anti-CD54 (clone HA58, PharMingen), anti-CD58 (clone 1C3,
PharMingen), anti-CD95 (Clone DX2, PharMingen), anti-EGF (Santa Cruz
Biotechnology, Santa Cruz, CA), anti-CD49f (clone GoH3, PharMingen) and
secondary antibodies conjugated to phycoerythrin (PE, 1:100 dilution,
Jackson ImmunoResearch, West Grove, PA). Next, 5 × 10⁵ to 10⁶ GFP⁺PE⁺
infected cells (3ZF library) or 10⁷ GFP⁺PE⁺ infected cells (6ZF library) were
sorted using a FACSVantage (BD, PharMingen), and the DNA encoding the
pool of TFZFs was recovered by PCR using the primers pMXf2 (forward)
5′-TCAAAGTAGACGGCATCG-3′ and VP64AscB (backward)
5′-TCGTCCAGCGCGCGTCGGCGCG-3′, and cloned again into the pMX
vector. PCR was typically carried out using 50 ng–1 µg of genomic DNA and
a program of 1 cycle of 5 min at 94 °C; 35 cycles of 30 s at 94 °C, 2 min at
52 °C, and 2 min (3ZF library) or 3 min (6ZF library) at 72 °C; and a
final cycle of 10 min at 72 °C. Independent selections were done for each cell-surface
marker. The selections were repeated for three rounds (3ZF library) or four
rounds (6ZF library). DNA from individual clones was prepared and used to
prepare virus to infect A431 cells. These cells were analyzed by flow cytometry
using ten different antibodies as described above. For downregulation analysis, zinc fingers were subcloned into pMX-IRES-GFP-SKD vector (containing
the KRAB repression domain, SKD; ref. 5) and infections were carried out as
described above. The cell lines A431, HeLa, and SKBR-3 were cultured as
described5, cell line MDA-MB-435s was obtained from the American Type
Culture Collection (Manassas, VA), and cell lines C8161 and HT29 were a
generous gift from R.A. Reisfeld of the Scripps Research Institute.
RNA extraction and RT-PCR. RNA from infected A431 cells and HUVEC cells
(Clonetics, San Diego, CA) was extracted with the Tri reagent method (MRC,
Cincinnati, OH). cDNA was made using an RT-PCR kit (Invitrogen, Carlsbad,
CA). PCR was performed using CDH5-specific primers25: VE-CAD-f (forward) 5′-CCGGCGCCAAAAGAGAGA-3′ and VE-CAD-b (backward) 5′-CTCCTTTTCCTTCAGCTGAAGTGGT-3′. Expression of GAPDH (encoding glyceraldehyde-3-phosphate dehydrogenase) was measured as a loading control
using the primers GAPDH-f (forward) 5′-CCATGTTCGTCATGGGTGTGA-3′
and GAPDH-b (backward) 5′-CATGGACTGTGGTCATGAGT-3′. CDH5
mRNA levels were normalized relative to TFZFs using primers NLSseq-F (forward) 5′-CCGAAAAAGAAACGCAAAGTTGGG-3′ and pMXB (backward)
5′-CAGAATTTCGACCACTGTGC-3′, which amplify VP64. PCR conditions
were 1 cycle of 3 min at 94 °C; 20–30 cycles of 1 min at 94 °C, 2.5 min at 52 °C,
and 2 min at 72 °C; and 1 cycle of 5 min at 72 °C. PCR products were visualized
in a 1% (CDH5) or 1.5% agarose gel (GAPDH) and quantified using
ImageQuant 1.2. The 1-kbp CDH5-specific PCR product was sequenced and
shown to correspond to the expected CDH5 sequence.
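The normalization described above can be sketched as follows. This is a minimal illustration and not the authors' analysis: the band intensities are placeholder numbers standing in for densitometry readings, and the names `BandIntensities`, `normalized_cdh5`, and `relative_level` are hypothetical. The sketch divides the CDH5 band intensity by the GAPDH (loading control) or TFZF (NLSseq-F/pMXB amplicon) intensity from the same sample, then expresses one sample relative to another.

```python
from dataclasses import dataclass


@dataclass
class BandIntensities:
    """Band intensities for one sample (arbitrary units)."""
    cdh5: float
    gapdh: float
    tfzf: float  # amplicon from the NLSseq-F / pMXB primer pair


def normalized_cdh5(sample: BandIntensities, reference: str = "gapdh") -> float:
    """Return the CDH5 signal divided by the chosen reference signal."""
    if reference == "gapdh":
        ref_value = sample.gapdh
    elif reference == "tfzf":
        ref_value = sample.tfzf
    else:
        raise ValueError("reference must be 'gapdh' or 'tfzf'")
    if ref_value <= 0:
        raise ValueError("reference band intensity must be positive")
    return sample.cdh5 / ref_value


def relative_level(sample: BandIntensities, baseline: BandIntensities,
                   reference: str = "gapdh") -> float:
    """Return the normalized CDH5 level of `sample` relative to `baseline`."""
    return normalized_cdh5(sample, reference) / normalized_cdh5(baseline, reference)


# Placeholder numbers for illustration only; they are not data from the study.
baseline_clone = BandIntensities(cdh5=100.0, gapdh=800.0, tfzf=400.0)
selected_clone = BandIntensities(cdh5=1000.0, gapdh=2000.0, tfzf=500.0)

print(relative_level(selected_clone, baseline_clone, "gapdh"))
print(relative_level(selected_clone, baseline_clone, "tfzf"))
```

Normalizing to GAPDH corrects for differences in input cDNA, whereas normalizing to the TFZF amplicon corrects for differences in transcription factor expression between clones, so the two ratios need not agree.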
Luciferase assays. The human CDH5 promoter fragment (–2486 to +24) was
amplified from A431 cells by PCR using the primers cdh5pro-f3 (forward) 5′-GAGGAGGAGGAGGAGGGTACCGGGGCCCAAGAAATCTGCATATTC-3′
and cdh5pro-b2 (backward) 5′-GAGGAGGAGGAGGAGAGATCTTGTTTCTGTTCCGTTGGACTGC-3′. The products were sequenced and cloned into
pGL3basic (Promega, Madison, WI). Next, 100 ng of reporter construct, 75 ng
1. Beerli, R.R. & Barbas, C.F. III. Engineering polydactyl zinc-finger transcription factors. Nat. Biotechnol. 20, 135–141 (2002).
2. Brummelkamp, T.R., Bernards, R. & Agami, R. A system for stable expression of
short interfering RNAs in mammalian cells. Science 296, 550–553 (2002).
3. Kawasaki, H., Onuki, R., Suyama, E. & Taira, K. Identification of genes that function
in the TNF-α-mediated apoptotic pathway using randomized hybrid ribozyme
libraries. Nat. Biotechnol. 20, 376–380 (2002).
4. Walden, R. et al. Activation tagging: a means of isolating genes implicated as playing a role in plant growth and development. Plant Mol. Biol. 26, 1521–1528 (1994).
5. Beerli, R.R., Dreier, B. & Barbas, C.F. III. Positive and negative regulation of
endogenous genes by designed transcription factors. Proc. Natl. Acad. Sci. USA 97,
1495–1500 (2000).
6. Zhang, L. et al. Synthetic zinc finger transcription factor action at an endogenous
chromosomal site. Activation of the human erythropoietin gene. J. Biol. Chem.
275, 33850–33860 (2000).
7. Liu, P.Q. et al. Regulation of an endogenous locus using a panel of designed zinc
finger proteins targeted to accessible chromatin regions. Activation of vascular
endothelial growth factor A. J. Biol. Chem. 276, 11323–11334 (2001).
8. Dreier, B., Beerli, R.R., Segal, D.J., Flippin, J.D. & Barbas, C.F. III. Development of
zinc finger domains for recognition of the 5′-ANN-3′ family of DNA sequences and
their use in the construction of artificial transcription factors. J. Biol. Chem. 276,
29466–29478 (2001).
9. Jamieson, A.C., Kim, S.H. & Wells, J.A. In vitro selection of zinc fingers with
altered DNA-binding specificity. Biochemistry 33, 5689–5695 (1994).
10. Segal, D.J., Dreier, B., Beerli, R.R. & Barbas, C.F. III. Toward controlling gene
expression at will: selection and design of zinc finger domains recognizing each of
the 5′-GNN-3′ DNA target sequences. Proc. Natl. Acad. Sci. USA 96, 2758–2763
(1999).
11. Beerli, R.R., Segal, D.J., Dreier, B. & Barbas, C.F. III. Toward controlling gene
expression at will: specific regulation of the erbB-2/HER-2 promoter by using polydactyl zinc finger proteins constructed from modular building blocks. Proc. Natl.
Acad. Sci. USA 95, 14628–14633 (1998).
12. Dreier, B., Segal, D.J. & Barbas, C.F. III. Insights into the molecular recognition of
the 5′-GNN-3′ family of DNA sequences by zinc finger domains. J. Mol. Biol. 303,
489–502 (2000).
13. Liu, Q., Xia, Z. & Case, C.C. Validated zinc finger protein designs for all 16 GNN
DNA triplet targets. J. Biol. Chem. 277, 3850–3856 (2002).
14. Venter, J.C. et al. The sequence of the human genome. Science 291, 1304–1351
(2001).
15. Liu, X., Sun, Y., Constantinescu, S.N., Karam, E., Weinberg, R.A. & Lodish, H.F.
Transforming growth factor β-induced phosphorylation of Smad3 is required for
growth inhibition and transcriptional induction in epithelial cells. Proc. Natl. Acad.
Sci. USA 94, 10669–10674 (1997).
of TFZF cloned in pcDNA3 (Invitrogen), and 100 ng of CMV-LacZ reporter
were transiently cotransfected into A431 cells. Luciferase activities were measured
using a luciferase reporter assay system (Promega). Transfection efficiencies
were normalized with the β-galactosidase reporter system (Galacto-Light Plus
kit; Tropix, Bedford, MA). Data represent the average of 6–12 experiments.
Point mutations in the promoter were introduced by PCR using a high-fidelity
enzyme (Roche, Indianapolis, IN) and verified by DNA sequencing.
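The reporter normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the readings are placeholder numbers, the function names are hypothetical, and expressing the result as fold activation over a reporter-only control is an assumption made for the example. Each luciferase reading is divided by the β-galactosidase reading from the same well to correct for transfection efficiency, and the normalized values are then averaged across replicates.

```python
from statistics import mean
from typing import List, Tuple

Well = Tuple[float, float]  # (luciferase reading, beta-galactosidase reading)


def normalized_activity(luciferase: float, beta_gal: float) -> float:
    """Correct one luciferase reading for transfection efficiency."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def mean_normalized(wells: List[Well]) -> float:
    """Average the normalized activity over replicate wells."""
    return mean([normalized_activity(luc, gal) for luc, gal in wells])


def fold_activation(tfzf_wells: List[Well], control_wells: List[Well]) -> float:
    """Mean normalized activity with a TFZF relative to a control condition."""
    return mean_normalized(tfzf_wells) / mean_normalized(control_wells)


# Placeholder readings for illustration only; they are not data from the study.
control_wells = [(1000.0, 100.0), (2000.0, 200.0)]
tfzf_wells = [(8000.0, 100.0), (12000.0, 200.0)]

print(fold_activation(tfzf_wells, control_wells))
```

Normalizing each well before averaging keeps a single poorly transfected well from distorting the mean, which matters when 6–12 experiments are pooled.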
In vitro analysis of TFZF binding, mobility-shift experiments, and CAST.
These assays were done as described previously11,30 (see Supplementary
Experimental Protocol online).
Note: Supplementary information is available on the Nature Biotechnology
website.
Acknowledgments
The authors thank D. Valente and N. Niederberger for technical support, and
D.J. Segal and X. Li for critical reading of the manuscript. This work was
supported by US National Institutes of Health grants CA86258 and DK61803.
L. Magnenat was the recipient of postdoctoral fellowships from the Swiss
National Science Foundation.
Competing interests statement
The authors declare that they have competing financial interests: see the Nature
Biotechnology website (http://www.nature.com/naturebiotechnology) for
details.
Received 16 September 2002; accepted 3 January 2003
16. Dejana, E., Bazzoni, G. & Lampugnani, M.G. Vascular endothelial (VE)–cadherin:
only an intercellular glue? Exp. Cell Res. 252, 13–19 (1999).
17. Vittet, D., Buchou, T., Schweitzer, A., Dejana, E. & Huber, P. Targeted null-mutation in the vascular endothelial-cadherin gene impairs the organization of vascular-like structures in embryoid bodies. Proc. Natl. Acad. Sci. USA 94, 6273–6278
(1997).
18. Carmeliet, P. et al. Targeted deficiency or cytosolic truncation of the VE-cadherin
gene in mice impairs VEGF-mediated endothelial survival and angiogenesis. Cell
98, 147–157 (1999).
19. Liao, F. et al. Monoclonal antibody to vascular endothelial–cadherin is a potent
inhibitor of angiogenesis, tumor growth, and metastasis. Cancer Res. 60,
6805–6810 (2000).
20. Liao, F. et al. Selective targeting of angiogenic tumor vasculature by vascular
endothelial–cadherin antibody inhibits tumor growth without affecting vascular
permeability. Cancer Res. 62, 2567–2575 (2002).
21. Gory, S., Vernet, M., Laurent, M., Dejana, E., Dalmon, J. & Huber, P. The vascular
endothelial–cadherin promoter directs endothelial-specific expression in transgenic mice. Blood 93, 184–192 (1999).
22. Gory, S. et al. Requirement of a GT box (Sp1 site) and two Ets binding sites for
vascular endothelial cadherin gene transcription. J. Biol. Chem. 273, 6750–6755
(1998).
23. Lelievre, E., Mattot, V., Huber, P., Vandenbunder, B. & Soncin, F. ETS1 lowers capillary endothelial cell density at confluence and induces the expression of VE-cadherin. Oncogene 19, 2438–2446 (2000).
24. Lelievre, E., Lionneton, F., Mattot, V., Spruyt, N. & Soncin, F. Ets-1 regulates fli-1
expression in endothelial cells. Identification of ETS binding sites in the fli-1 gene
promoter. J. Biol. Chem. 277, 25143–25151 (2002).
25. Elrod-Erickson, M., Rould, M.A., Nekludova, L. & Pabo, C.O. Zif268 protein-DNA
complex refined at 1.6 Å: a model system for understanding zinc finger-DNA interactions. Structure 4, 1171–1180 (1996).
26. Hendrix, M.J.C. et al. Expression and functional significance of VE-cadherin in
aggressive human melanoma cells: role in vasculogenic mimicry. Proc. Natl.
Acad. Sci. USA 98, 8018–8023 (2001).
27. Ordiz, M.I., Barbas, C.F. III & Beachy, R.N. Regulation of transgene expression in
plants with polydactyl zinc finger transcription factors. Proc. Natl. Acad. Sci. USA
99, 13290–13295 (2002).
28. Guan, X. et al. Heritable endogenous gene regulation in plants with designed
polydactyl zinc finger transcription factors. Proc. Natl. Acad. Sci. USA 99,
13296–13301 (2002).
29. Barbas, C.F. III, Burton, D.R., Scott, J.K. & Silverman, G.J. Phage-display vectors. in
Phage Display: A Laboratory Manual 2.1–2.19 (CSH, New York, 2001).
30. Segal, D.J. et al. Evaluation of a modular strategy for the construction of novel
polydactyl zinc finger DNA-binding proteins. Biochemistry (in press).