Genomics 81 (2003) 292–303
www.elsevier.com/locate/ygeno
Sequence, gene structure, and expression pattern of CTNNBL1, a
minor-class intron-containing gene: evidence for a role in apoptosis
Leila Jabbour (a), Jean F. Welter (b), John Kollar (b), and Thomas M. Hering (a,b,*)
(a) Department of Anatomy, Case Western Reserve University, Cleveland, OH 44106, USA
(b) Department of Orthopaedics, Case Western Reserve University, Cleveland, OH 44106, USA
Received 9 May 2002; accepted 6 December 2002
Abstract
We have identified and characterized a cDNA designated CTNNBL1 (catenin (cadherin-associated protein), β-like 1) coding for a protein
of 563 amino acids having predicted structural homology to β-catenin and other armadillo (arm) family proteins. CTNNBL1 is expressed
in multiple human tissues, and its sequence is conserved across widely divergent species. The human CTNNBL1 gene on chromosome
20q11.2 contains 16 exons spanning > 178 kb. Intron 4 is a minor-class intron bearing AT at the 5′ splice site and AC at the 3′ splice site.
An acidic domain, as well as a putative bipartite nuclear localization signal, a nuclear export signal, a leucine-isoleucine zipper, and
phosphorylation motifs are present in the protein sequence. Transient expression of CTNNBL1 in CHO cells results in localization to the
nucleus and apoptosis. The rate of cell death was higher when cells were transfected with a carboxy-terminal fragment of CTNNBL1,
suggesting that the apoptosis-inducing activity is a function of this region.
© 2003 Elsevier Science (USA). All rights reserved.
Keywords: CTNNBL1; β-catenin; Armadillo; Arm; Chromosome 20q11.2; Minor-class intron; U12-dependent intron; Apoptosis
Introduction
A number of proteins with diverse functions are built of
arm (armadillo) repeats, a 42-amino acid structural unit
composed of three α-helices, including a short helix and two
longer helices [1,2]. In arm family proteins, consecutive arm
repeats are arrayed to form a right-handed superhelix of
α-helices. The conformations of arm motifs are very similar
to each other, but their amino acid sequences are highly
variable. An extensive surface groove formed by the array
of α-helices in β-catenin and other arm family proteins
provides a surface for protein-protein interactions [3].
Arm motifs are found in a number of proteins with
diverse roles, many of which are involved in cadherin-mediated adhesion. Arm motifs were originally identified
in the Drosophila melanogaster segment polarity gene
armadillo [4], a component of the multiprotein adherens
junction (AJ) complex. Further work revealed armadillo’s vertebrate homolog β-catenin [5], a multifunctional protein that combines features of a structural component of cell-cell junctions with those of a transcription
factor. In the AJ complex, β-catenin functions as a bridge
to connect E-cadherin with α-catenin, which subsequently associates with actin filaments [6]. Nonjunctional
β-catenin is rapidly degraded by the ubiquitin-proteasome system [7]. Wnt signaling-mediated stabilization of
β-catenin results in nuclear accumulation and complex
formation with lymphoid-enhancer factor 1/T-cell factor
1 (LEF/TCF) transcription factors, and subsequent activation of LEF/TCF target genes [8]. Adenomatous polyposis coli (APC) is an arm-motif protein [9] that functions as a negative regulator of β-catenin signaling [10].
Plakoglobin [11], a close relative of armadillo/β-catenin,
is functionally similar to β-catenin in AJs but has different functions in desmosomes [12]. p120ctn is another
arm-domain protein that apparently has both positive and
negative effects upon cadherin-mediated adhesion.
p120ctn was originally identified as a substrate of the Src
oncoprotein [13] and was later shown to interact directly
with cadherins [14]. Unlike β-catenin, however, p120ctn
does not interact with α-catenin or with APC [15].
Sequence data from this article have been deposited with the GenBank Data Libraries under accession numbers as follows: Homo sapiens
CTNNBL1: AF239607, AL109964, AL023804, AL118499. Mus musculus
CTNNBL1: AY009405. Caenorhabditis elegans CTNNBL1: AAB37831,
U80450. Drosophila melanogaster CTNNBL1: AE003681, AAF54309.
Schizosaccharomyces pombe CTNNBL1: CAB52570. Arabidopsis thaliana CTNNBL1: AAF32478. Danio rerio CTNNBL1 (ESTs): BI883368,
AI584702, BM036242, AI794469, BI881598, AI794082, AI883274,
AI584203, BI887925, BI881994, BI883314. Bos taurus CTNNBL1:
AF037349.
* Corresponding author. Fax: +1-216-368-1332.
E-mail address: tmh@po.cwru.edu (T.M. Hering)
0888-7543/03/$ – see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0888-7543(02)00038-1
Exemplifying the diverse roles played by arm proteins,
karyopherin α (Kapα) functions in nuclear import and SmgGDS is involved in guanine nucleotide exchange. Kapα has
hydrophilic amino- and carboxy-terminal regions flanking a
central domain consisting of tandem arm repeats [16]. Kapα
functions as a heterodimer with a second subunit termed
Kapβ [17], interacting through the N-terminal region of
Kapα [18]. There is evidence that the arm-repeat domain of
Kapα contains the binding site for nuclear localization sequences (NLSs) of proteins destined for nuclear import [19].
The yeast Kapα ortholog Srp1p (SRP1) was originally identified as a suppressor of RNA polymerase I mutations in
Saccharomyces cerevisiae [20]. This evidence has led to the
suggestion that Srp1p may carry out a critical step in the
assembly of RNA polymerase I by mediating nuclear import
of polymerase subunits [17]. More recent work indicates
that Srp1p may function in regulation of protein degradation
through the ubiquitin-proteasome system [21]. SmgGDS
has been shown to be an exchange factor for Ras-related
small G proteins [22]. The arm repeats of SmgGDS compose nearly the entire protein [1], and thus are likely to play
a role in the guanine nucleotide exchange activity. Rho
proteins, small GTPases that regulate actin-myosin interactions, are activated by association with guanine nucleotide
exchange factors (GEFs), which stimulate binding of GTP
to Rho in exchange for GDP. RhoA is activated by a number
of GEFs in vitro, including SmgGDS [23–25]. A similar
protein in Dictyostelium discoideum termed darlin (Dictyostelium armadillo-like protein) has also been described [26].
We report here the identification of CTNNBL1 (catenin
(cadherin-associated protein), β-like 1), a protein that
may structurally resemble armadillo-repeat proteins. Sequence analysis indicates considerable conservation of
CTNNBL1 across diverse species. The CTNNBL1 gene
contains a minor-class (AT-AC) intron, and we have
identified alternatively spliced forms not employing the
minor-class intron splice site. CTNNBL1 has motifs characteristic of transcription factors, consistent with our
observation that overexpressed CTNNBL1 localizes to the
nucleus. Moreover, transfected cells were found to undergo apoptosis, suggesting a role for CTNNBL1 in this
process.
Results and discussion
Cloning of human CTNNBL1 cDNA
A 223-bp differential display fragment amplified from
bovine chondrocyte RNA was isolated and sequenced, and
used for a BLAST search against the National Center for
Biotechnology Information (NCBI) expressed sequence tag
(EST) database. A human EST homologous to the bovine
sequence was identified that represented the 5′-end sequence of cDNA clone IMAGE:809437 available from
American Type Culture Collection (ATCC), which was
obtained and sequenced in its entirety. Additional sequence
at the 5′-terminal end was provided from overlapping ESTs
found in the database, and a contig containing a putative
open reading frame (ORF) was generated. Because no start
codon contained as part of a Kozak sequence could be
identified, the sequence of the contig’s 5′ end was used to
design three primers to conduct a 5′ rapid amplification of
cDNA ends (RACE) procedure. Two consecutive PCR reactions were carried out, each time using a nested primer for
specificity, and the final PCR product was TA-cloned. Sequencing of the RACE product revealed a potential start
codon included within a Kozak consensus sequence, preceded by an in-frame translation termination codon. The
sequence of the RACE product combined with the previous
contig sequence represents the full-length 1806-bp human
CTNNBL1 cDNA. The ATG codon for the first methionine
residue is in a context corresponding to the consensus sequence for translation initiation in vertebrate mRNAs
(GCCA/GCCATGG) [27], matching at the most important
positions, that is, +4 (G) and −3 (A), as well as positions
−1 (C), −2 (C), and −6 (G). An ORF of 1698 nt starts at
this putative methionine start codon, predicting a protein of
563 amino acids. This is the largest ORF that can be found
in this cDNA. A sequence (AATTAAA) similar to the
consensus polyadenylation signal (AATAAA) is located 86
nt downstream of the stop codon. This putative polyadenylation signal sequence is followed by the beginning of a
putative poly(A) tail at a position 13 nt 3′ to the AATTAAA
motif. Northern blot analysis indicates that we have isolated
a nearly full-length CTNNBL1 cDNA, in that CTNNBL1
cDNA sequence is 1806 bp long and the mRNA species
detected by northern blotting is ~2.1 kb in length (see
Fig. 3).
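The start-codon context check described above can be sketched in a few lines of Python. This is only an illustration and not the procedure used in this study: the function name `kozak_strength`, the three-level classification, and the example sequence (the consensus itself) are assumptions introduced here. The check looks only at the two positions named in the text as most important, a purine at −3 and G at +4.

```python
def kozak_strength(seq: str, atg_index: int) -> str:
    """Classify the context of an ATG by the two key Kozak positions.

    Numbering follows the usual convention: the A of ATG is +1 and the
    base immediately upstream is -1 (there is no position 0).
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the ATG")
    minus3 = seq[atg_index - 3]   # position -3
    plus4 = seq[atg_index + 3]    # position +4
    hits = (minus3 in "AG") + (plus4 == "G")
    return {2: "strong", 1: "adequate", 0: "weak"}[hits]


# Hypothetical input: the consensus GCCACCATGG, not the CTNNBL1 sequence.
consensus = "GCCACCATGG"
print(kozak_strength(consensus, consensus.find("ATG")))
```

A context like the one reported for CTNNBL1, with A at −3 and G at +4, would fall in the "strong" class under this scheme.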
CTNNBL1 multiple species sequence alignment
A mouse cDNA was produced by RT-PCR from mouse
tissue-derived RNA. Additionally, a BLAST search using
the human CTNNBL1 cDNA as a query sequence against
the mouse EST database revealed homology with numerous
mouse ESTs; these, together with our cDNA sequence, were
aligned into a full-length contig. EST sequences were also
aligned to obtain a complete Danio rerio (zebrafish) sequence, and apparently complete CTNNBL1 sequences from
Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Schizosaccharomyces pombe (yeast)
were available in GenBank. In Fig. 1, human CTNNBL1
sequence is aligned with CTNNBL1 sequences from six
additional species.
The human CTNNBL1 cDNA sequence predicts a protein
of 563 amino acids with a calculated molecular mass (Mr) of
65.2 kDa, and an estimated isoelectric point of 4.96. The
mouse CTNNBL1 homolog is also 563 amino acid residues
in length, and shows 96% identity to the human sequence.
Evolutionary conservation of this protein is evident when
comparing predicted amino acid sequence from widely divergent species including D. rerio (564 amino acids), D.
melanogaster (581 amino acids), C. elegans (544 amino
acids), A. thaliana (454 amino acids), and S. pombe (564
amino acids). The A. thaliana CTNNBL1 ortholog may be
incomplete, having been assembled from genomic sequence
as a conceptual translation.
Genomic organization and chromosomal localization of
CTNNBL1
The human CTNNBL1 gene contains 16 exons and 15
introns spanning > 178 kb on chromosome 20. Genomic
organization of CTNNBL1 is detailed in Figs. 2A and 2C.
CTNNBL1 is found on three human genomic clones from
chromosome 20q11.23 to 20q12. Exons 1–7 (bp −84 to
750) are in clone HS1168M15, exons 8–15 (bp 751–1603)
are in clone HS633020, and exon 16 (bp 1604–1796) is in
clone HS1118M15.
The splice junctions follow the GT-AG rule [28] for all
but the fourth intron, which contains AT at the 5′ splice site
and AC at the 3′ splice site, occurring within splice site and
branch site motifs that are highly conserved in “minor-class” introns. Two distinct types of pre-mRNA introns
have been described in eukaryotic genomes (reviewed by
[29,30]). These include a major U2-dependent class and a
minor U12-dependent class, the names of which reflect a
requirement for one of the four snRNAs in each pathway.
U12-dependent introns occur in genes also having multiple
U2-dependent introns. To date only a small number of genes
have been shown to possess U12-dependent introns, most of
which share a set of conserved elements that distinguish
them from U2-dependent introns including conserved sequences at the splice junctions and branchpoint. In the
fourth intron of the CTNNBL1 gene, the 5′ splice site
(ATATCCTT) conforms perfectly to the consensus [31].
The 3′ splice site (TTCAC) differs from the consensus
(YCCAC) at a single nucleotide. The branchpoint (CCCTTAAC), located 10 residues upstream of the splice site, is in
good agreement with the consensus (TCCTTAAC), with
a C in place of the consensus T at the first residue.
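The comparison of the intron 4 elements with the U12-type consensus can be reproduced with a short script. This is a minimal sketch: the helper name `mismatches` is introduced here for illustration, and because the text states that the 5′ splice site matches perfectly without quoting the consensus, the 5′ consensus is assumed to be identical to the observed sequence.

```python
# IUPAC nucleotide codes needed for the consensus strings quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT",    # pyrimidine
    "R": "AG",    # purine
    "N": "ACGT",  # any base
}


def mismatches(observed: str, consensus: str) -> list[int]:
    """Return 1-based positions where `observed` violates `consensus`."""
    if len(observed) != len(consensus):
        raise ValueError("sequences must have equal length")
    return [
        i
        for i, (base, code) in enumerate(zip(observed, consensus), start=1)
        if base not in IUPAC[code]
    ]


# (observed in CTNNBL1 intron 4, consensus). The 5' consensus is an
# assumption: it is set equal to the observed site, per the text.
elements = {
    "5' splice site": ("ATATCCTT", "ATATCCTT"),
    "3' splice site": ("TTCAC", "YCCAC"),
    "branchpoint": ("CCCTTAAC", "TCCTTAAC"),
}
for name, (obs, cons) in elements.items():
    print(name, mismatches(obs, cons))
```

The 3′ splice site and the branchpoint each report a single mismatching position, in line with the description above.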
A search for homologous genes in GenBank revealed a
testes development-related gene, NYD-SP19 [32], as an
alternatively spliced form of CTNNBL1. The NYD-SP19
transcript codes for a protein 376 amino acids in length,
compared to the 563 in CTNNBL1, because of the absence
of exons 1–3 and exon 5 (Fig. 2B). A 68-bp region (designated Ealt) from within the third intron of CTNNBL1 (position 51349–51417 of clone HS1168M15) serves as the first
exon of NYD-SP19. This alternative first exon is spliced to
exon 4, which is 40 bp longer in the NYD-SP19 transcript,
employing an alternative 5′ splice site within intron 4. Exon
4 (+40 nt) is spliced to exon 6, skipping exon 5. In NYD-SP19, therefore, the U12-dependent (AT-AC) splice site
between exons 4 and 5 is not used, in favor of a
GT-AG splice between exon 4 (+40 nt) and exon 6.
NYD-SP19 is spliced identically to CTNNBL1 from exon 6
through the end of exon 16. The final three residues of the
additional 40 nt at the end of exon 4, because of use of the
alternative splice site, compose the initiation codon (ATG).
The initiation ATG is preceded by two in-frame stop codons
within exon 4.
CTNNBL1 was found to map between markers D2Ucl31
(93 cM) and D2Wsu58e (94 cM) on mouse chromosome 2.
No obvious mouse phenotypes were found to co-localize
with the CTNNBL1 locus. Genomic localization data are also
available for CTNNBL1 in C. elegans and D. melanogaster.
No genomic information is yet available for yeast and bovine homologs. According to the WormBase database
(www.wormbase.sanger.ac.uk), the C. elegans CTNNBL1
genomic sequence is located on chromosome I, and the
approximate genetic map position is I:0.18 or 5,310,802–5,312,930.
In the FlyBase database (www.flybase.bio.indiana.edu), the D. melanogaster CTNNBL1 gene
(CG11964, FlyBase ID: FBgn0037644) is localized to
85C2–85C3 on the right arm of chromosome 3 (3R).
Tissue expression profile of CTNNBL1 mRNA
Northern blot analysis, using as a probe a 490-bp DraI
fragment isolated from the insert of cDNA clone 809437,
revealed an mRNA of ~2.1 kb in all human tissues tested
(Fig. 3), except in testes, where a doublet at 2.1 and 1.9 kb
was seen. It is possible that the 1.9-kb mRNA band observed in the testes may represent the potentially shorter
NYD-SP19 transcript, lacking exons 1–3 and 5. Although
CTNNBL1 mRNA was detectable in all tissues, it
was especially abundant in skeletal muscle, placenta, heart, spleen, testes, and thyroid.
Fig. 1. Multiple sequence alignment showing primary sequence conservation of CTNNBL1 across seven different species. Black (identity) or gray (similarity)
shaded residues indicate that a majority of amino acids are conserved.
Fig. 2. Genomic organization of the human CTNNBL1 gene and alternative transcript NYD-SP19. (A) Exon-intron structure of CTNNBL1. Exons are
designated as E1-E16. Numbers below the diagram refer to intron lengths (bp). The location of the minor-class (AT-AC) intron is shown. Exons and introns
are drawn approximately to scale, but exon scale is greatly magnified relative to that of introns. (B) Exon-intron structure of NYD-SP19. Alternative exon
within CTNNBL1 intron 3 is designated Ealt. (C) Lengths of exons and sequence at exon/intron junctions in CTNNBL1. Exon position refers to numbering
in genomic clones HS1168M15, HS633020, and HS1118M15. Intron and untranslated sequences are represented in lowercase letters, whereas translated
residues are in uppercase. Deduced amino acids are indicated above coding sequence. Residues matching the U12-dependent intron splice junction consensus
sequences are indicated in bold italics. Residues matching the U12-dependent intron branchpoint consensus are highlighted.
Fig. 3. Northern blot analysis of CTNNBL1 expression in different human
tissues. A commercially obtained (Clontech) northern blot containing 2 μg
of poly(A) RNA per lane was probed with a C-terminal fragment of human
CTNNBL1 cDNA. Blots were probed for human β-actin to demonstrate
equivalent loading. CTNNBL1 mRNA band is ~2.1 kb in size, relative to
standards run on the same gel. In testes, a doublet with 2.1- and 1.9-kb
bands was observed as seen in the lane labeled testes (s.e.), which represents a shorter exposure of the lane labeled testes.
Protein structure prediction and putative functional
domains
Several motifs that are variably conserved between species (Fig. 1) may be relevant to the function of CTNNBL1.
A bipartite nuclear localization signal (BNLS) was identified from Lys16 to Lys33 (KRPRDDEEEEQKMRRK) by
the PROSITE program [33]. It is composed of two basic
amino acids, lysine (K) and arginine (R), separated by an
11-amino acid spacer region from a cluster of three basic
amino acids R, R, and K. The BNLS is variably conserved
across species. BNLSs have been identified in several recently described human and mouse genes [34–36]. Mutation of the BNLS has been shown to alter protein function
[37].
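The layout of the bipartite signal described above (two basic residues, an 11-residue spacer, then three basic residues) maps directly onto a regular expression. The sketch below is illustrative only and is not the PROSITE profile used in the analysis: the names `BNLS` and `find_bnls` are introduced here, and the spacer length is fixed at 11 to mirror the CTNNBL1 motif, whereas general bipartite-NLS definitions tolerate some variation.

```python
import re

# Two basic residues, an 11-residue spacer, then three basic residues.
# The fixed spacer length is an assumption made for this illustration.
BNLS = re.compile(r"[KR]{2}.{11}[KR]{3}")

motif = "KRPRDDEEEEQKMRRK"  # sequence quoted in the text
print(bool(BNLS.fullmatch(motif)))


def find_bnls(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in BNLS.finditer(protein)]
```

Applied to a full-length protein sequence, `find_bnls` would list candidate sites only; whether any of them is a functional signal needs experimental support such as the transfections reported later in this paper.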
The N-terminal end of CTNNBL1 also includes a highly
acidic region from Asp20 to Glu79 in which 43% of the
residues are aspartic or glutamic acid. Glutamate and aspartate residues are found isolated and in continuous stretches
(Glu22–Glu25, Glu43–Glu45, Glu68–Glu75) within this region. Similar acidic regions in other nuclear proteins have
been shown to be involved in transcriptional activation
[38,39]. Acidic domains described as responsible for transcriptional activation do not fit a consensus, but are usually
between 40 and 100 amino acids long and rich in acidic
residues.
A well-conserved region in the protein sequence that
may represent a nuclear export signal (NES) is found in
CTNNBL1 from Leu164 to Leu174 (LLQELTDIDTL). The
motif in CTNNBL1 resembles the NES motifs in APC protein (LLERLKELNL and LTKRIDSLPL), as well as other
leucine-rich NES motifs present in other proteins [40]. In
APC, the NESs were demonstrated to be involved in the
shuttling of the protein between the nucleus and the cytoplasm. Nuclear export has not yet been demonstrated to
occur with CTNNBL1.
It is interesting to note that the alternatively spliced
product of the CTNNBL1 gene (NYD-SP19) lacks the
BNLS and NES motifs (in CTNNBL1 exons 2 and 5, respectively), as well as the N-terminal acidic region found in
the CTNNBL1 transcript. These features of the alternatively
spliced product suggest that NYD-SP19 may be a nonnuclear protein functionally distinct from CTNNBL1.
A leucine-isoleucine zipper motif, indicating the potential for multimerization of CTNNBL1, is present in the
human and mouse CTNNBL1 protein sequence. This region
is composed of one leucine and three isoleucine residues
positioned every seven residues, forming a potential helix
starting from Leu519 to Ile540. In this region, leucine and
isoleucine residues align along one side of the potential
helix to form a “zipper”. Leucine zipper motifs have been
described in transcription factors, where they allow for
dimers to form. Isoleucine zipper motifs have been described in kinases [41]. A leucine-isoleucine zipper motif
has also been described in steroid receptor-binding factor
(RBF), where it is believed to facilitate dimer formation
[42]. Transfection of CHO cells with the CTNNBL1 deletion
mutant lacking the zipper motif, as described later, suggests
that this region of the protein may have a function in the
biological activity of CTNNBL1. Because leucine and isoleucine are abundant in CTNNBL1, the apparent zipper
motif may alternatively be a structural feature, with these
hydrophobic residues buried within the core of the folded
protein.
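The heptad spacing that defines the putative zipper can be illustrated with a small sketch. The function names and the toy peptide are hypothetical and are not taken from the CTNNBL1 sequence; the scan simply looks for leucine or isoleucine recurring at every seventh residue, which, as noted above, can also arise by chance in a protein rich in these residues.

```python
def heptad_positions(start: int, end: int) -> list[int]:
    """1-based residue numbers spaced seven apart from start to end."""
    return list(range(start, end + 1, 7))


def find_li_zippers(protein: str, repeats: int = 4) -> list[int]:
    """Return 1-based starts where L or I recurs at every 7th residue.

    `repeats` is the number of consecutive heptad positions required.
    """
    span = 7 * (repeats - 1)
    return [
        i + 1
        for i in range(len(protein) - span)
        if all(protein[i + 7 * k] in "LI" for k in range(repeats))
    ]


# Residue numbers implied by the Leu519 to Ile540 interval in the text.
print(heptad_positions(519, 540))

# Hypothetical toy peptide (not CTNNBL1): L or I at positions 1, 8, 15, 22.
toy = "L" + "A" * 6 + ("I" + "A" * 6) * 2 + "I"
print(find_li_zippers(toy))
```

The interval from residue 519 to residue 540 yields four heptad positions, which matches the one leucine and three isoleucines described in the text.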
When CTNNBL1 was subjected to fold recognition
analysis (3D-PSSM server; http://www.sbg.bio.ic.ac.uk/~3dpssm/), it was found to resemble structurally the
family of proteins containing armadillo repeats, a structural motif originally identified in the D. melanogaster
segment polarity gene product armadillo, the ortholog of
mammalian β-catenin. A core region of β-catenin is
composed of 12 copies of a 42-amino acid sequence
motif known as an armadillo (arm) repeat. The three-dimensional structure of this region has been determined
[2], as well as that of yeast karyopherin α [43]. It has
been established that the arm repeats of arm proteins
form a superhelix of helices that features a long, positively
charged groove. Fold recognition conducted by querying
a structural database with the predicted structure of
CTNNBL1 matched yeast karyopherin α (E value = 0.0512;
95% certainty), a protein containing 10 arm repeats (Fig. 4).
Although CTNNBL1 does not have a strong primary sequence homology with β-catenin, structural homology lies
in the number and position of the hydrophobic residues,
which are predicted to form helices.
Fig. 4. Structure-based sequence alignment of predicted arm repeats in CTNNBL1 with the 10 arm repeats of yeast karyopherin-α determined from crystal
structure. Residue numbers of CTNNBL1 (top sequence) and yeast karyopherin-α (bottom sequence) are indicated at the end of each line. Drawing at the top
indicates the structural elements of a single arm repeat. Thick bars above or below sequences indicate regions predicted to be helical.
As with numerous transcription factors, β-catenin levels
are regulated by phosphorylation, and glycogen synthase
kinase-3β (GSK-3β) has been shown to be the primary
kinase responsible for this modification [44]. Although
there is not a strict consensus motif for phosphorylation by
GSK-3β, many GSK-3β substrates require prior phosphorylation by a priming kinase to form the motif S-X-X-X-S(P). CTNNBL1 contains numerous potential motifs for
phosphorylation by cAMP-dependent protein kinase
(RKQT33 and KKIS49), protein kinase C (TKR17, SVK83,
SYK95, SVK209, and SPR391), and casein kinase II
(TVVE50, SELD16, TMPD32, TDID72, TLHE76, SVKE210,
STAE305, SNRE328, and TEKE402). If the structural similarity to β-catenin suggests a similar functional role for
CTNNBL1, these motifs may be relevant to its regulation in
the cell.
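The candidate sites listed above are consistent with the short site patterns commonly given in PROSITE for these three kinases. The sketch below encodes those patterns as regular expressions and checks the listed motifs against them. It is an illustration under stated assumptions, not the analysis pipeline used here: the dictionary and function names are introduced for this example, and short patterns of this kind match frequently by chance, so a hit indicates only a potential site.

```python
import re

# PROSITE-style site patterns written as regular expressions
# (assumed here to correspond to the motif classes named in the text).
PATTERNS = {
    "cAMP-dependent protein kinase": r"[RK]{2}.[ST]",
    "protein kinase C": r"[ST].[RK]",
    "casein kinase II": r"[ST].{2}[DE]",
}

# Motifs listed in the text, with residue numbers omitted.
REPORTED = {
    "cAMP-dependent protein kinase": ["RKQT", "KKIS"],
    "protein kinase C": ["TKR", "SVK", "SYK", "SVK", "SPR"],
    "casein kinase II": ["TVVE", "SELD", "TMPD", "TDID", "TLHE",
                         "SVKE", "STAE", "SNRE", "TEKE"],
}


def check_reported() -> dict[str, bool]:
    """True for a kinase when every listed motif fits its pattern."""
    return {
        kinase: all(re.fullmatch(PATTERNS[kinase], m) for m in motifs)
        for kinase, motifs in REPORTED.items()
    }


def scan(protein: str, kinase: str) -> list[int]:
    """1-based start positions of candidate sites, overlaps included."""
    lookahead = re.compile(f"(?=({PATTERNS[kinase]}))")
    return [m.start() + 1 for m in lookahead.finditer(protein)]


print(check_reported())
```

The lookahead in `scan` is used so that overlapping candidate sites are all reported, which plain `finditer` on the bare pattern would skip.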
Transient transfections of Chinese hamster ovary cells
Protein overexpression in CHO cells has been used previously to demonstrate the subcellular localization of proteins and to ascertain their possible involvement in apoptosis
[45,46]. To test the functional significance of motifs in the
CTNNBL1 primary sequence, CHO cells were transfected
with three different constructs coding for CTNNBL1-enhanced green fluorescent protein (CTNNBL1-EGFP) fusion
proteins. The constructs contained the full-length CTNNBL1
protein (P65), or its C-terminal region only (P14), lacking
the BNLS but containing the putative zipper motif, or its
N-terminal region only (P48), lacking the zipper motif but
containing the BNLS and the NES. EGFP fluorescence was
used to visualize subcellular localization of the expressed
proteins. DAPI staining was done to visualize nuclei. Transfection efficiency was on the order of 20–30%.
Fig. 5 presents the different patterns observed with these
constructs. Figure 5A shows the intracellular distribution of
EGFP following transfection with the EGFP vector alone,
presenting a diffuse intense expression throughout the cell.
When expressed as an EGFP fusion protein, CTNNBL1 is
found predominantly in the nucleus of transfected cells (Fig.
5B), completely overlapping with the nuclear DAPI staining. The distribution of EGFP-P48 (Fig. 5C) was similarly
restricted to the nucleus. Expression of EGFP-P14 resulted
in a strikingly different pattern of distribution, appearing as
distinct “clumps” at the periphery of the nucleus (Figs. 5D
and 5E). Although the perinuclear aggregate was the most
commonly seen pattern of distribution of EGFP-P14, some
cells showed a speckled distribution in the cytoplasm, especially when cell death was evident by nuclear condensation or fragmentation (Fig. 5F). The exclusively cytoplasmic distribution of EGFP-P14 following transfection
suggests that the BNLS, which is excluded from this construct, may have a nuclear localization function. The cytoplasmic speckled pattern observed in cells transfected with
EGFP-P14 may be due to abnormal trafficking of this truncated form of CTNNBL1 and sequestration in the endoplasmic reticulum. Alternatively, P14 may be targeted to the
mitochondria, which have been shown to be involved in
apoptosis and observed to migrate to the nuclear periphery
during that event [47].
Nuclear condensation and DNA fragmentation are two
morphological signs of apoptosis [48]. Condensation of
nuclei is observed by DAPI staining in EGFP-P14-transfected cells as well as in EGFP-P65-transfected cells. TdT-mediated dUTP-biotin nick end-labeling (TUNEL) assays
confirmed that apoptosis was occurring in transfected cells.
Fig. 6 shows TUNEL-positive cells following transfection
with EGFP-P14 (Fig. 6A) and EGFP-P65 (Fig. 6B). Apoptosis following transfection with EGFP-P14 was also demonstrated by the presence of condensed or fragmented nuclei in transfected cells (Fig. 6C). To quantify the induction
of apoptosis following transfection (Fig. 7), cells were
transfected with EGFP-P14, EGFP-P65, EGFP-P48, or
EGFP vector alone, and the percentage of TUNEL-positive
cells as a function of transfection with the CTNNBL1 expression construct or EGFP vector alone was calculated. For
EGFP-P14- and EGFP-P65-transfected cells, 60% and 27%
were TUNEL positive, respectively. EGFP-P48 and EGFP
vector-transfected cells showed only 1% TUNEL-positive
nuclei. In each case, the number of TUNEL-positive, nontransfected cells was also negligible. This result suggests
that the C-terminal region of CTNNBL1, which contains the
putative leucine-isoleucine zipper motif, may be required
for the induction of apoptosis.
In summary, we have determined the cDNA sequence of
CTNNBL1 in the human and mouse, and demonstrated substantial conservation of CTNNBL1 in additional diverse
species, suggesting a critical function in eukaryotic cells.
Fold recognition analysis indicated that CTNNBL1 structurally resembles armadillo repeat proteins. We have determined that CTNNBL1 is localized to human chromosome
20q11.2, and to the distal end of mouse chromosome 2. An
interesting feature of the gene is the presence of a minor-class (AT-AC) intron. An alternatively spliced form of the
gene that does not employ the minor-class intron splice sites
has been identified. A number of motifs found in the
CTNNBL1 sequence are characteristic of transcription factors. These features include a highly acidic region representing a potential transcriptional activation domain, a
BNLS that may function in transport of CTNNBL1 into the
nucleus, a NES that may be involved in the shuttling of the
protein between the nucleus and the cytoplasm, and a
leucine-isoleucine zipper motif at the C-terminal end of the
protein that may function in multimerization. Consistent
with features of the protein sequence, CTNNBL1 localizes to
the nucleus of CHO cells transfected by an EGFP-CTNNBL1 expression construct. Transfected cells were observed to undergo changes characteristic of apoptosis. We
have expressed N- and C-terminal fragments of CTNNBL1
to ascertain the potential function of structural motifs in the
primary sequence. In conclusion, this work indicates that
CTNNBL1 could be a regulator of apoptosis in eukaryotic
cells. Clarification of the mechanism of this regulation will
be the subject of future studies.
Fig. 5. Cellular localization of CTNNBL1 in CHO cells: DAPI staining and
green fluorescence observed at 358 nm and 488 nm under fluorescence
microscopy. (A) Diffuse cellular expression of the empty EGFP vector.
(B–D) Distribution of the three different fusion proteins. (B) EGFP-P65
nuclear diffuse pattern, (C) EGFP-P48 nuclear diffuse pattern, and (D)
perinuclear clumping of EGFP-P14. (E) Perinuclear clumping and (F)
cytoplasmic speckles of EGFP-P14 are shown at a higher magnification.
Scale bar in A applies to (A–C), and scale bar in (E) applies to (E, F).
Fig. 6. Evidence of apoptosis by TUNEL assay in cells expressing EGFP-P14 and EGFP-P65. TUNEL-positive cells were observed at 578 nm with
a rhodamine filter under fluorescence microscopy. The green fluorescent
image was superimposed. (A) EGFP-P14 is evident as cytoplasmic aggregates (green arrows), and nuclei of TUNEL-positive cells are indicated by
red arrows. (B) EGFP-P65 expression (green arrow) is co-localized to cells
exhibiting TUNEL-positive nuclei (red arrow). (C) Cells were stained with
DAPI and observed with appropriate filter following transfection with
EGFP-P14. Cells expressing EGFP-P14 (green arrows) exhibited condensed (white arrow) or fragmented (red arrow) nuclei. Nucleus of a
nontransfected cell is indicated with a blue arrow. Scale bars are indicated
in each panel.
Fig. 7. Apoptosis in CHO cells observed following transfections with
EGFP constructs. Graph shows percentage of TUNEL-positive cells as a
function of transfection with EGFP-P14, EGFP-P65, EGFP-P48, or EGFP
vector alone. Nontransfected control cultures were counted for comparison.
Materials and methods
Isolation of human CTNNBL1 cDNA
The 3′-terminal region of CTNNBL1 was initially identified by differential display of bovine chondrocyte-derived
mRNA expressed in a model for cartilage repair. Although
CTNNBL1 was subsequently determined to be unregulated
in this experimental system (data not shown), a human
partial sequence available as an EST (IMAGE:809437) was
used to design PCR primers for 5′-RACE (Invitrogen,
Carlsbad, CA) analysis to obtain the full-length human
cDNA sequence. Template cDNA was reverse-transcribed
from human placenta mRNA (BD Biosciences Clontech,
Palo Alto, CA) priming with oligo(dT), purified, and dC-tailed. cDNA was denatured at 94°C for 30 s, then amplification was carried out for 30 cycles using the following
parameters: denaturation at 94°C for 1 min, annealing at
54°C for 1 min using manufacturer’s primer “AAP” as
upper primer with a gene-specific lower primer (no. 1,
5′-ATTTTCTTCACTGAGCTT-3′) matching sequence in
the 5′ region of ATCC clone 809437, and elongation at
72°C for 2 min. PCR was terminated with a final elongation
at 72°C for 7 min. A second round of PCR was conducted
at an annealing temperature of 54°C for 30 cycles using
manufacturer’s primer “UAP” as upper primer and a gene-specific lower primer (no. 2, 5′-TCCAATGGCTCCTCCTCT-3′) matching sequence immediately upstream of
gene-specific primer no. 1. A PCR product was generated
from that “nested” PCR and TA-cloned into the PCRII
vector (Invitrogen, Carlsbad, CA).
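For readers who wish to reproduce the first-round cycling conditions, the program described above can be written down as a small data structure. This is a sketch for bookkeeping only: the class and variable names are introduced here, and the computed duration covers programmed hold times alone, since instrument ramp rates are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# First-round 5' RACE program as described in the text.
INITIAL = Step("initial denaturation", 94, 30)
CYCLE = [
    Step("denaturation", 94, 60),
    Step("annealing", 54, 60),
    Step("elongation", 72, 120),
]
FINAL = Step("final elongation", 72, 7 * 60)
N_CYCLES = 30


def total_hold_seconds() -> int:
    """Sum of programmed hold times; ramping between steps is ignored."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


print(total_hold_seconds() / 60, "min of programmed hold time")
```

The nested second round uses the same annealing temperature and cycle number, so under the assumption that its other parameters were unchanged the same structure would apply.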
Amplification of mouse CTNNBL1 cDNA and chromosome
mapping
Mouse ESTs homologous to human CTNNBL1 were
aligned using the Vector NTI suite program (Informax,
Bethesda, MD), and a mouse contig was generated. On the
basis of this sequence, five oligonucleotide primers were
designed (no. 1, 5⬘-TGGTTCGGGAGTTGAGTGGAG-3⬘;
no. 2, 5⬘-TTTGTTCAAGCCATACAACTGT-3⬘; no. 3, 5⬘ACATCATTCAGGAGATGCACG-3⬘; no. 4, 5⬘-ACGATCTTGATGGAGCTGCCA-3⬘; no. 5, 5⬘-TGAAGTGCTGGCCATCCTCCT-3⬘) and used in six different
combinations to amplify mouse cDNA from mouse embryo,
L. Jabbour et al. / Genomics 81 (2003) 292–303
mouse placenta, and mouse spleen mRNA. cDNA synthesis
was performed using the SuperScript Preamplification System (Invitrogen, Carlsbad, CA). cDNA was denatured at
94°C for 30 s, then amplification was carried out for 30
cycles using the following parameters: denaturation at 94°C
for 1 min, annealing at 55°C for 1 min, and elongation at
72°C for 2 min. PCR was terminated by one final elongation
at 72°C for 7 min. All primer pairs amplified predicted-size
products. The 1830-bp product representing the full-length
mouse cDNA was cloned and sequenced.
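The cycling program described above implies a minimum run time that can be worked out directly from the stated hold times. A minimal sketch, assuming ramp times between temperatures are ignored (neither the instrument nor its ramp rates are reported):

```python
# Sum of programmed hold times for the cycling profile given in the text.
# Ramp times are ignored, which is an assumption made for this sketch.

INITIAL_DENATURATION_S = 30
CYCLES = 30
STEPS_PER_CYCLE_S = {
    "denaturation (94 C)": 60,
    "annealing (55 C)": 60,
    "elongation (72 C)": 120,
}
FINAL_ELONGATION_S = 7 * 60


def total_hold_seconds(initial, cycles, steps, final):
    """Initial hold + cycles * (sum of per-cycle holds) + final hold."""
    return initial + cycles * sum(steps.values()) + final


if __name__ == "__main__":
    total = total_hold_seconds(
        INITIAL_DENATURATION_S, CYCLES, STEPS_PER_CYCLE_S, FINAL_ELONGATION_S
    )
    print(f"Programmed hold time: {total} s ({total / 60:.1f} min)")
```

The annealing temperature does not affect the total, so the same figure applies to the other amplifications in this section that use the 30 s / 30 cycles / 7 min profile.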
A murine CTNNBL1 PCR primer pair (5′-CCAAGATGCCCTTCGATG-3′ and 5′-GGATAACTGCTGAAGAAG-3′), predicted to amplify a 367-bp product between
exons 7 and 8 in the human gene, was used to amplify an
intragenic CTNNBL1 fragment from C57BL/6J, Mus spretus, and BL/6J × spretus F1 genomic DNA. The F1 amplimer contained a denaturing HPLC (dHPLC)-detectable
heteroduplex. Consequently, amplimers from the BSS backcross mapping panel available from the Jackson Laboratory
[49] were scored as homozygous for the spretus allele or
heterozygous for C57BL/6J and spretus alleles based on
dHPLC. Results were submitted to the Jackson Laboratory,
where precise mapping was determined relative to previously tested markers.
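Backcross scores of this kind are typically turned into a map position by counting recombinants between the new locus and each reference marker. The sketch below illustrates that calculation only. The genotype strings are hypothetical placeholders ("S" for a spretus homozygote, "H" for a heterozygote), not data from the BSS panel, and the actual mapping was performed by the Jackson Laboratory with its own procedures.

```python
import math


def recombination_fraction(locus_a, locus_b):
    """Return (fraction, standard error) for two equally long genotype strings.

    Each position is one backcross animal; a mismatch between the two loci
    is counted as a recombinant.
    """
    if not locus_a or len(locus_a) != len(locus_b):
        raise ValueError("genotype strings must be non-empty and equally long")
    n = len(locus_a)
    recombinants = sum(a != b for a, b in zip(locus_a, locus_b))
    r = recombinants / n
    se = math.sqrt(r * (1 - r) / n)  # binomial standard error
    return r, se


if __name__ == "__main__":
    # Hypothetical scores for ten animals (illustration only).
    new_locus = "SHHSSHSHHS"
    marker = "SHHSHHSHHS"
    r, se = recombination_fraction(new_locus, marker)
    print(f"recombination fraction = {r:.2f} +/- {se:.3f}")
```

With real panel data, animals with missing scores would need to be excluded pairwise before counting, which this sketch does not handle.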
Sequence analysis
Automated sequencing was done by the DNA sequencing core facility of the Northeastern Ohio Multipurpose
Arthritis Center of Case Western Reserve University. Human and mouse cDNA were translated using MacVector
software (Accelrys, San Diego, CA). The putative protein
sequence was analyzed using the “proteomic tools” available from the Expasy website (http://www.expasy.ch/).
BLAST searches were conducted through the NCBI database. AssemblyLign software (Accelrys, San Diego, CA)
was used to generate human contigs, and the Vector NTI
suite program (Informax, Bethesda, MD) was used to generate the mouse contigs. Multiple sequence alignments were
generated with MacVector (Accelrys, San Diego, CA) software using the Clustal W algorithm [50].
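Conceptual translation of a cDNA amounts to reading the open reading frame codon by codon with the standard genetic code. The sketch below is a minimal illustration, not the MacVector implementation; the example input is the first 15 coding nucleotides (ATG onward) contained in the P65 upper primer listed under "Cloning of expression constructs".

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, stop_at_stop=True):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    protein = []
    for pos in range(0, usable, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*" and stop_at_stop:
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Coding start taken from the P65 upper primer in the text.
    print(translate("ATGGACGTGGGCGAA"))
```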
Cloning of expression constructs
Primers were designed to amplify three regions of
CTNNBL1 from placenta cDNA (BD Biosciences Clontech,
Palo Alto, CA), such that these regions could be cloned in
frame with the EGFP gene in the EGFP-N1 vector (BD
Biosciences Clontech, Palo Alto, CA). Upper primers contained a HindIII site and lower primers contained a BamHI
site for ligation into the EGFP-N1 vector. Three constructs
were generated: P65, representing the full-length sequence
coding for CTNNBL1 (563 amino acids), amplified with
5′-GGTCAAGCTTACCATGGACGTGGGCGAACT-3′
and 5′-GGTCGGATCCCGGAAGTTCTCCAG-3′; P48,
representing the region coding for the first 441 amino acids
that contains the N-terminal BNLS but not the C-terminal
putative isoleucine zipper motif, amplified with 5′-GGTCAAGCTTACCATGGACGTGGGCGAACT-3′ and 5′-GGTCGGATCCCGTAGTCTGTCAACCTTCTCACT-3′;
P14, representing the region coding for the last 122 amino
acids that contains the C-terminal putative isoleucine zipper
motif but not the N-terminal BNLS, amplified with 5′-GGTCAAGCTTCTAATGGAGTTGCATTTTAAA-3′ and
5′-GGTCGGATCCCGGAAGTTCTCCAG-3′. DNA was
denatured at 94°C for 30 s, then amplification was carried
out for 30 cycles using the following parameters: denaturation at 94°C for 1 min, annealing at 59°C for 1 min, and
elongation at 72°C for 2 min. PCR was terminated by one
final elongation at 72°C for 7 min. PCR products of expected sizes were generated for P14, P48, and P65, which
are 381, 1338, and 1704 bp, respectively. PCR products
were ligated into the EGFP-N1 vector.
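The stated primer design can be checked by locating the HindIII (AAGCTT) and BamHI (GGATCC) recognition sequences in each primer. A small sketch follows; the recognition sequences are the standard ones for these enzymes and the primer sequences are those given above, while the dictionary labels and function name are illustrative.

```python
# Locate the cloning sites in the construct primers quoted in the text.
SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC"}

PRIMERS = {
    "P65/P48 upper": "GGTCAAGCTTACCATGGACGTGGGCGAACT",
    "P65/P14 lower": "GGTCGGATCCCGGAAGTTCTCCAG",
    "P48 lower": "GGTCGGATCCCGTAGTCTGTCAACCTTCTCACT",
    "P14 upper": "GGTCAAGCTTCTAATGGAGTTGCATTTTAAA",
}


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based start index."""
    seq = seq.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {find_sites(seq)}")
```

Each upper primer carries only the HindIII site and each lower primer only the BamHI site, in both cases after a four-nucleotide 5′ extension, consistent with directional cloning into the vector.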
Northern blotting
ATCC clone 809437 was purchased and used to probe
human multiple tissue blots H, H2, and H3 (BD Biosciences
Clontech, Palo Alto, CA), containing 2 μg/lane mRNA
isolated from a variety of different human tissues. Membranes were probed with a random primer-radiolabeled
(Roche Diagnostics, Indianapolis, IN) 458-bp DraI fragment isolated from ATCC clone 809437. Hybridization was
carried out using UltraHyb (Ambion, Austin, TX) according
to the manufacturer’s instructions. Membranes were autoradiographed to MR film (Eastman Kodak, Rochester, NY).
Transfection of CHO cells and TUNEL assay
Transfections were carried out in wild-type CHO cells
using the Fugene reagent (Roche Diagnostics, Indianapolis,
IN). CHO-K1 cells (ATCC, Manassas, VA) were cultured
in Ham’s F12 with 10% (vol/vol) FBS and grown to 50–80%
confluency in chamber wells (Nalge Nunc International
Corp., Naperville, IL). Transfection reagents and DNA were
used at volumes and amounts based on the manufacturer’s
recommended ranges. Cells were fixed 24 h post-transfection, first in 70% (vol/vol) ethanol (1 min) then in 90%
(vol/vol) ethanol (1 min). Cells were then assayed for DNA
fragmentation by the TUNEL assay (Roche Diagnostics,
Indianapolis, IN) according to manufacturer’s instructions.
Cells were stained with DAPI at 1 μg/ml for 15 min at
37°C, overlaid with SlowFade (Molecular Probes Inc., Eugene, OR), and covered with a coverslip. Cells were observed at 358 nm, 488 nm, and 578 nm with the appropriate
filters using a Nikon microscope equipped with a UV lamp.
For each experiment, 10 random fields, each containing an
average of 200 cells, were counted and the percentage of
TUNEL-positive cells as a function of transfection with
EGFP-P14, EGFP-P65, EGFP-P48, or EGFP vector alone
was calculated. Images were captured with a Spot digital
camera (Diagnostic Instruments, Sterling Heights, MI).
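The quantification described above reduces to pooling counts over the scored fields. A minimal sketch with hypothetical counts follows; the field values are placeholders, not results from this study, and the sketch assumes the percentage is taken over all counted cells in the pooled fields, which the text does not specify.

```python
def percent_tunel_positive(fields):
    """Pooled percentage of TUNEL-positive cells.

    fields: iterable of (tunel_positive, total_cells) tuples, one per field.
    """
    fields = list(fields)
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * positive / total


if __name__ == "__main__":
    # Hypothetical counts for three fields (illustration only).
    example_fields = [(12, 200), (8, 190), (15, 210)]
    print(f"{percent_tunel_positive(example_fields):.1f}% TUNEL-positive")
```

Pooling counts before dividing weights each field by its cell number; averaging per-field percentages instead would weight all fields equally.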
Acknowledgments
We thank Matthew Stewart for providing mouse mRNA,
Matthew Warman for his assistance in mapping the mouse
CTNNBL1 gene, and Patrick Klepcyk for DNA sequencing.
This work was supported by NIH grants AG13856, AR20618, and
AR46196.
References
[1] M. Peifer, S. Berg, A. B. Reynolds, A repeating amino acid motif
shared by proteins with diverse cellular roles, Cell 76 (1994) 789–791.
[2] A. H. Huber, W. J. Nelson, W. I. Weis, Three-dimensional structure
of the armadillo repeat region of β-catenin, Cell 90 (1997) 871–882.
[3] M. R. Groves, D. Barford, Topological characteristics of helical
repeat proteins, Curr. Opin. Struct. Biol. 9 (1999) 383–389.
[4] B. Riggleman, E. Wieschaus, P. Schedl, Molecular analysis of the
armadillo locus: uniformly distributed transcripts and a protein with
novel internal repeats are associated with a Drosophila segment
polarity gene, Genes Dev. 3 (1989) 96–113.
[5] P. D. McCrea, C. W. Turck, B. Gumbiner, A homolog of the armadillo protein in Drosophila (plakoglobin) associated with E-cadherin,
Science 254 (1991) 1359–1361.
[6] D. L. Rimm, E. R. Koslov, P. Kebriaei, C. D. Cianci, J. S. Morrow,
α1(E)-catenin is an actin-binding and -bundling protein mediating the
attachment of F-actin to the membrane adhesion complex, Proc. Natl.
Acad. Sci. USA 92 (1995) 8813–8817.
[7] H. Aberle, A. Bauer, J. Stappert, A. Kispert, R. Kemler, β-catenin is
a target for the ubiquitin-proteasome pathway, EMBO J. 16 (1997)
3797–3804.
[8] Q. Eastman, R. Grosschedl, Regulation of LEF-1/TCF transcription
factors by Wnt and other signals, Curr. Opin. Cell Biol. 11 (1999)
233–240.
[9] K. W. Kinzler, et al., Identification of FAP locus genes from chromosome 5q21, Science 253 (1991) 661–665.
[10] V. Korinek, et al., Constitutive transcriptional activation by a β-catenin-Tcf complex in APC−/− colon carcinoma, Science 275 (1997)
1784–1787.
[11] W. W. Franke, et al., Molecular cloning and amino acid sequence of
human plakoglobin, the common junctional plaque protein, Proc.
Natl. Acad. Sci. USA 86 (1989) 4027–4031.
[12] A. Schmidt, et al., Desmosomes and cytoskeletal architecture in
epithelial differentiation: cell type-specific plaque components and
intermediate filament anchorage, Eur. J. Cell Biol. 65 (1994) 229–245.
[13] A. B. Reynolds, D. J. Roesel, S. B. Kanner, J. T. Parsons, Transformation-specific tyrosine phosphorylation of a novel cellular protein in
chicken cells expressing oncogenic variants of the avian cellular src
gene, Mol. Cell Biol. 9 (1989) 629–638.
[14] S. Shibamoto, et al., Association of p120, a tyrosine kinase substrate,
with E-cadherin/catenin complexes, J. Cell Biol. 128 (1995) 949–957.
[15] J. M. Daniel, A. B. Reynolds, The tyrosine kinase substrate p120cas
binds directly to E-cadherin but not to the adenomatous polyposis coli
protein or α-catenin, Mol. Cell Biol. 15 (1995) 4819–4824.
[16] D. Gorlich, S. Prehn, R. A. Laskey, E. Hartmann, Isolation of a
protein that is essential for the first step of nuclear protein import, Cell
79 (1994) 767–778.
[17] C. Enenkel, G. Blobel, M. Rexach, Identification of a yeast karyopherin heterodimer that targets import substrate to mammalian nuclear pore complexes, J. Biol. Chem. 270 (1995) 16499–16502.
[18] J. Moroianu, G. Blobel, A. Radu, The binding site of karyopherin α
for karyopherin β overlaps with a nuclear localization sequence, Proc.
Natl. Acad. Sci. USA 93 (1996) 6572–6576.
[19] P. Cortes, Z. S. Ye, D. Baltimore, RAG-1 interacts with the repeated
amino acid motif of the human homologue of the yeast protein SRP1,
Proc. Natl. Acad. Sci. USA 91 (1994) 7633–7637.
[20] R. Yano, M. Oakes, M. Yamaghishi, J. A. Dodd, M. Nomura, Cloning
and characterization of SRP1, a suppressor of temperature-sensitive
RNA polymerase I mutations, in Saccharomyces cerevisiae, Mol.
Cell. Biol. 12 (1992) 5640–5651.
[21] M. M. Tabb, P. Tongaonkar, L. Vu, M. Nomura, Evidence for
separable functions of Srp1p, the yeast homolog of importin α
(Karyopherin α): role for Srp1p and Sts1p in protein degradation,
Mol. Cell Biol. 20 (2000) 6062–6073.
[22] A. Kikuchi, et al., Molecular cloning of the human cDNA for a
stimulatory GDP/GTP exchange protein for c-Ki-ras p21 and smg
p21, Oncogene 7 (1992) 289–293.
[23] T. Mizuno, et al., A stimulatory GDP/GTP exchange protein for smg
p21 is active on the post-translationally processed form of c-Ki-ras
p21 and rhoA p21, Proc. Natl. Acad. Sci. USA 88 (1991) 6442–6446.
[24] S. Orita, et al., Comparison of kinetic properties between two mammalian ras p21 GDP/GTP exchange proteins, ras guanine nucleotide-releasing factor and smg GDP dissociation stimulation, J. Biol. Chem.
268 (1993) 25542–25546.
[25] T. H. Chuang, X. Xu, L. A. Quilliam, G. M. Bokoch, SmgGDS
stabilizes nucleotide-bound and -free forms of the Rac1 GTP-binding
protein and stimulates GTP/GDP exchange through a substituted
enzyme mechanism, Biochem. J. 303 (1994) 761–767.
[26] K. K. Vithalani, et al., Identification of darlin, a Dictyostelium protein
with Armadillo-like repeats that binds to small GTPases and is important for the proper aggregation of developing cells, Mol. Biol. Cell
9 (1998) 3095–3106.
[27] M. Kozak, At least six nucleotides preceding the AUG initiator codon
enhance translation in mammalian cells, J. Mol. Biol. 196 (1987)
947–950.
[28] R. Breathnach, C. Benoist, K. O’Hare, F. Gannon, P. Chambon,
Ovalbumin gene: evidence for a leader sequence in mRNA and DNA
sequences at the exon-intron boundaries, Proc. Natl. Acad. Sci. USA
75 (1978) 4853–4857.
[29] C. B. Burge, R. A. Padgett, P. A. Sharp, Evolutionary fates and
origins of U12-type introns, Mol. Cell 2 (1998) 773–785.
[30] Q. Wu, A. R. Krainer, AT-AC pre-mRNA splicing mechanisms and
conservation of minor introns in voltage-gated ion channel genes,
Mol. Cell Biol. 19 (1999) 3225–3236.
[31] S. L. Hall, R. A. Padgett, Conserved sequences in a class of rare
eukaryotic nuclear introns with non-consensus splice sites, J. Mol.
Biol. 239 (1994) 357–365.
[32] Z. M. Zhou, et al., Expression of a novel reticulon-like gene in human
testis, Reproduction 123 (2002) 227–234.
[33] A. Bairoch, P. Bucher, K. Hofmann, PROSITE, Nucleic Acids Res.
25 (1997) 217–221.
[34] J. Guo, G. C. Sen, Characterization of the interaction between the
interferon-induced protein P56 and the int6 protein encoded by a
locus of insertion of the mouse mammary tumor virus, J. Virol. 74
(2000) 1892–1899.
[35] J. R. Dunlevy, B. L. Berryhill, J. P. Vergnes, N. SundarRaj, J. R.
Hassell, Cloning, chromosomal localization, and characterization of
cDNA from a novel gene, SH3BP4, expressed by human corneal
fibroblasts, Genomics 62 (1999) 519 –524.
[36] Z. Yang, C. Y. Yu, Organizations and gene duplications of the human
and mouse MHC complement gene clusters (1), Exp. Clin. Immunogenet. 17 (2000) 1–17.
[37] C. Cinti, et al., Genetic alterations disrupting the nuclear localization
of the retinoblastoma-related gene RB2/p130 in human tumor cell
lines and primary tumors, Cancer Res. 60 (2000) 383–389.
[38] J. Ma, M. Ptashne, The carboxy-terminal 30 amino acids of GAL4 are
recognized by GAL80, Cell 50 (1987) 137–142.
[39] I. A. Hope, S. Mahadevan, K. Struhl, Structural and functional characterization of the short acidic transcriptional activation region of
yeast GCN4 protein, Nature 333 (1988) 635–640.
[40] K. L. Neufeld, et al., Adenomatous polyposis coli protein contains
two nuclear export signals and shuttles between the nucleus and
cytoplasm, Proc. Natl. Acad. Sci. USA 97 (2000) 12085–12090.
[41] D. S. Dorow, L. Devereux, T. de Kretser, Identification of a new
family of human epithelial protein kinases containing two leucine/
isoleucine-zipper domains, Eur. J. Biochem. 213 (1993) 701–710.
[42] T. J. Barrett, et al., Interactions of the nuclear matrix-associated
steroid receptor binding factor with its DNA binding element in the
c-myc gene promoter, Biochemistry 39 (2000) 753–762.
[43] E. Conti, M. Uy, L. Leighton, G. Blobel, J. Kuriyan, Crystallographic
analysis of the recognition of a nuclear localization signal by the
nuclear import factor karyopherin α, Cell 94 (1998) 193–204.
[44] C. Yost, et al., The axis-inducing activity, stability, and subcellular
distribution of β-catenin is regulated in Xenopus embryos by glycogen synthase kinase 3, Genes Dev. 10 (1996) 1443–1454.
[45] R. J. Krieser, A. Eastman, The cloning and expression of human
deoxyribonuclease II. A possible role in apoptosis, J. Biol. Chem. 273
(1998) 30909–30914.
[46] A. W. Gibson, T. Cheng, R. N. Johnston, Apoptosis induced by
c-myc overexpression is dependent on growth conditions, Exp. Cell
Res. 218 (1995) 351–358.
[47] S. Desagher, J. C. Martinou, Mitochondria as the central control point
of apoptosis, Trends Cell Biol. 10 (2000) 369–377.
[48] G. Hacker, The morphology of apoptosis, Cell Tissue Res. 301 (2000)
5–17.
[49] L. B. Rowe, et al., Maps from two interspecific backcross DNA
panels available as a community genetic mapping resource, Mamm.
Genome 5 (1994) 253–274.
[50] J. D. Thompson, D. G. Higgins, T. J. Gibson, CLUSTAL W:
improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties
and weight matrix choice, Nucleic Acids Res. 22 (1994) 4673–4680.