The wheat PDI (Protein Disulfide Isomerase) genes.
M. Ciaffi, O.A. Tanzarella and E. Porceddu*
Dipartimento di Agrobiologia e Agrochimica, Università della Tuscia, 01100 Viterbo, Italy
* Corresponding author:
Prof. Enrico Porceddu
Dipartimento di Agrobiologia e Agrochimica
Via S. Camillo De Lellis
01100 Viterbo
ITALY
E-mail: porceddu@unitus.it
Tel: 0039 0761 357231
Fax: 0039 0761 357256
Keywords: Gene structure, Gene expression, Triticum, Protein folding, Wheat quality
1. INTRODUCTION
Cultivated on over 217 million hectares of land, spanning from the Scandinavian peninsula in the north to Argentina in the south, wheat yields over
620 million tonnes of grain annually and is the staple food of over two billion people, more than
one third of the world population. Wheat grain is almost exclusively consumed after processing into
many different types of cooked food. This versatility is largely due to its storage proteins, which play an important role in determining dough properties.
Wheat storage proteins consist primarily of prolamins, which are synthesised in the
developing endosperm and targeted to the endoplasmic reticulum (ER) lumen, where they are
folded and connected by intermolecular disulfide bonds to form large aggregates (Shewry and
Tatham 1997). Generally, these protein polymers are deposited in massive protein bodies
within vesicles that shear directly off the ER, although some of them (primarily the monomeric gliadins) are transported to vesicles via the Golgi apparatus (Rubin et al. 1992). Therefore,
genes encoding seed storage proteins as well as factors that affect their deposition, such as
molecular chaperones and foldase enzymes, are of particular relevance to the wheat industry.
Even though wheat storage proteins have been studied extensively at both the
chemical and genetic levels (reviewed by Shewry et al. 2003), knowledge of the factors affecting
their folding and deposition is still extremely limited.
This paper summarises recent progress in understanding protein disulfide isomerase (PDI), an enzyme possibly involved in the folding of
endosperm storage proteins, and focuses specifically on the molecular characterisation of
wheat PDI genes.
2. THE PDI ENZYME
During the maturation of secretory proteins, disulfide bonds cross-linking specific cysteines are added to stabilize a protein or to join different polypeptides covalently. These bonds
are often crucial for the stability of the final protein structure (Tu and Weissmann 2004). The
slow spontaneous folding of proteins in vitro (hours to days, or never) is incompatible with
the time scale of their secretion in vivo (about 30-60 min). Eukaryotic cells have reconciled this discrepancy through a specialized redox environment, the ER (Frand et al. 2000),
which is equipped with enzymatic catalysts promoting disulfide bond formation and isomerization (Denecke 1996; Fassio and Sitia 2002). PDI is one of them. It is a long-lived, abundant
protein able to catalyze thiol-disulfide oxidation, reduction and isomerization; isomerization occurs either directly, through intramolecular disulfide rearrangement, or through cycles of reduction
and oxidation (Schwaller et al. 2003).
The complexity, structure and function of PDI have been extensively studied in mammals (for reviews see Ferrari and Soling 1999; Wilkinson and Gilbert 2004; Ellgaard and Ruddock 2005), in which it is a dimer consisting of two identical subunits of about 57 kDa and is
one of the most abundant proteins in the ER (Lyles and Gilbert 1991). Analyses of sequence
homology and NMR studies have shown that human PDI has a modular structure comprising
five domains: a, b, b’, a’ and c (Fig. 1). The a and a’ domains are homologous to thioredoxin,
a small protein involved in many cytoplasmic redox reactions (Freedman et al. 1994); both the a
and a’ domains contain a catalytic site for isomerase and redox activities consisting of the
Cys-Gly-His-Cys amino acid sequence (Noiva and Lennarz 1992). The middle b and b’ domains have a secondary structure similar to that of the a and a’ domains (Kemmink et al.
1997; Ferrari and Soling 1999), but do not show any significant homology to thioredoxin. The
c domain, at the C-terminus, is rich in acidic residues typical of calcium-binding proteins (Lucero
and Kaminer 1999), and has a KDEL sequence for ER retention (Denecke et al. 1992).
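The catalytic sites described above can be located in a protein sequence by a simple motif search. The following is a minimal sketch in Python, not a method taken from the cited studies; the function name find_active_sites and the toy sequence are hypothetical and serve only to illustrate how the two Cys-Gly-His-Cys (CGHC) sites of a typical PDI would be reported.

```python
import re
from typing import List

# Active-site motif named in the text (Cys-Gly-His-Cys, one-letter code CGHC).
ACTIVE_SITE = "CGHC"


def find_active_sites(protein: str, motif: str = ACTIVE_SITE) -> List[int]:
    """Return the 1-based start position of every occurrence of the motif."""
    seq = protein.upper()
    # A zero-width lookahead reports overlapping occurrences as well.
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() + 1 for m in re.finditer(pattern, seq)]


# Hypothetical toy sequence, NOT a real PDI: it simply carries two CGHC sites.
toy_protein = "MAAEEWCGHCKALAPEYAAAVKFYAPWCGHCKSLAPKDEL"

print(find_active_sites(toy_protein))
```

A protein with two thioredoxin-like active domains is expected to return two positions, whereas the single-domain PDI-like proteins discussed later would return one.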
PDI was initially considered a catalyst for disulfide bond formation (Freedman et al.
1994), but its ability to bind unfolded or partly folded proteins and prevent their aggregation also suggested a chaperone role (Hayano et al. 1995; Yao et al. 1997), as part of the
quality control system for the correct folding of the proteins synthesized in the ER (Turano et
al. 2002). Several additional important functions of PDI have also been reported (Ferrari and
Soling 1999; Pihlajanieni et al. 1987; Wetterau et al. 1990; Lucero and Kaminer 1999; Cheng
et al. 1987; Bennet et al. 1988; Tsai et al. 2001; Tanaka et al. 2000). The human PDI gene is
located at 17q25, is about 18 kb in size and consists of 11 exons and 10 introns, both varying
in size (Tasanen et al. 1988).
Typical PDI is the most prominent member of a growing family of related proteins
characterised by one, two or three thioredoxin-like active domains (Ferrari and Soling 1999).
Several PDI-like genes encoding proteins with unusual primary structure, different expression
patterns and exhibiting a surprising range of activities have recently been identified in every
extensively sequenced mammalian genome (Turano et al. 2002; Clissold and Bicknell 2003;
Wilkinson and Gilbert 2004; Ellgaard and Ruddock 2005). The term PDI refers to both the
family and the first member of the family isolated in mammals, which is also the best
characterized. About twenty members of the PDI family have recently been identified in humans (Ellgaard and Ruddock 2005).
All the proteins of the PDI family are located in the ER, some of them in relatively
high amounts, and are often considered to perform their functions exclusively in this compartment. Turano et al. (2002), however, have indicated that at least some members of the
family are also present in unusual subcellular locations (i.e. the cell surface, the extracellular
space, the cytosol and the nucleus), which are reached through an export mechanism not yet
identified. In a few cases their function outside the
ER is clearly related to their redox properties, but in most cases their mechanism of action is
still unknown, although their tendency to associate with other proteins, or even with DNA,
might be the main factor related to their activities (Turano et al. 2002).
3. PLANT PDI
Information on structural and functional aspects of PDIs and PDI-like proteins in
plants is still very limited. PDI cDNA sequences have been cloned and sequenced from species such as alfalfa (Shorrosh and Dixon 1991), barley (Chen and Hayes 1994), common and
durum wheat (Shimoni et al. 1995a; Ciaffi et al. 2001), maize (Li and Larkins 1996) and castor bean (Coughlan et al. 1996); two distinct cDNA sequences encoding PDI-related proteins
have been isolated and characterized in alfalfa (Shorrosh and Dixon 1992) and carrot (Xu et
al. 2002). Recently, the available whole-genome sequences of Arabidopsis and rice have made it possible to
investigate in detail the complexity and diversity of PDI-like genes in plants. The genome-wide structural annotation of the PDI gene family in Arabidopsis has resulted in the identification of 13 genes distributed across all its five chromosomes (Houston et al. 2005). Using the
Arabidopsis PDI-like sequences in iterative BLAST searches of public and proprietary sequence databases, an orthologous set of 12 PDI-like sequences was identified in both rice and
maize (Houston et al. 2005). Phylogenetic analyses conducted on the Arabidopsis, rice and
maize proteins indicated that the plant PDI family may include at least eight different gene
subfamilies, whose members can be grouped on the basis of sequence homology, number and
position of active domains and presence/absence of the KDEL signal for ER retention. Members of the first five groups or subfamilies (I-V) had two thioredoxin-like active domains and
showed structural similarities to different PDI-like proteins in other higher eukaryotes. The
remaining three subfamilies (VI-VIII) contain proteins with a single thioredoxin-like active
domain. Proteins in phylogenetic groups I, II and III were similar in size (500-560 aa) and
were predicted to be secretory proteins with putative signal peptides and the C-terminal
KDEL-like ER retention signals. The first group includes the typical PDIs identified in several plant species, whose members, as reported in the following sections, have
previously been characterized in wheat (Ciaffi et al. 2005). Proteins in group IV were
approximately 360 aa in length, but lacked a KDEL-like signal for ER retention. Members of
subfamily V were longer (approximately 440 aa) and had a KDEL-like ER retention signal at
their C-termini. Aside from having a single thioredoxin domain, group VI, VII and VIII proteins shared few structural features. Such diversity is not surprising, given the divergent evolutionary origins indicated by phylogenetic analyses. The close phylogenetic relationship between the nucleotide sequences encoding the single-domain group VII proteins and the N-terminal thioredoxin domain of members of groups I, II and III is consistent with the hypothesis that the group VII proteins emerged by domain loss from a two-domain PDI-like precursor. Members of subfamilies VI and VIII resembled small single-domain PDIs
from lower eukaryotes, such as yeast and Giardia lamblia, indicating that they could have retained an ancestral domain structure. PDI-like proteins of group VI were the shortest members
of the plant PDI family, with only approximately 150 aa, whereas proteins of groups VII
and VIII were much larger (418-485 aa). PDI-like proteins from groups VI and VII were predicted to be secretory proteins with signal peptides, but none of the single thioredoxin domain
proteins had KDEL-like sequences. Finally, all the members of groups VII and VIII contained a transmembrane segment and hence were predicted to be membrane proteins.
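The features that distinguish the eight subfamilies can be collected in a small lookup table. The sketch below is illustrative only: the Subfamily class and the candidate_groups function are hypothetical names, the values are transcribed from the description above, and the absence of a transmembrane segment in groups I to VI is an assumption, since the text reports such a segment only for groups VII and VIII.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Subfamily:
    group: str           # phylogenetic group, I to VIII
    trx_domains: int     # number of thioredoxin-like active domains
    approx_length: str   # size in amino acids, as quoted in the text
    kdel: bool           # C-terminal KDEL-like ER retention signal
    transmembrane: bool  # predicted transmembrane segment


# Values transcribed from the text. "False" for the transmembrane field in
# groups I to VI is an assumption: the text does not state it explicitly.
SUBFAMILIES = [
    Subfamily("I", 2, "500-560", True, False),
    Subfamily("II", 2, "500-560", True, False),
    Subfamily("III", 2, "500-560", True, False),
    Subfamily("IV", 2, "~360", False, False),
    Subfamily("V", 2, "~440", True, False),
    Subfamily("VI", 1, "~150", False, False),
    Subfamily("VII", 1, "418-485", False, True),
    Subfamily("VIII", 1, "418-485", False, True),
]


def candidate_groups(trx_domains: int, kdel: bool,
                     transmembrane: Optional[bool] = None) -> List[str]:
    """Return the groups whose recorded features match the observed ones."""
    return [
        s.group
        for s in SUBFAMILIES
        if s.trx_domains == trx_domains
        and s.kdel == kdel
        and (transmembrane is None or s.transmembrane == transmembrane)
    ]


print(candidate_groups(2, True))
print(candidate_groups(1, False, transmembrane=True))
```

Such a table only narrows the candidates; assignment to a subfamily in the cited work rests on phylogenetic analysis, not on these features alone.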
Despite the recent analysis of the complexity and diversity of the PDI gene family of
plants, there are still numerous unanswered questions concerning the location and physiological function of the individual proteins. Up to now most studies on the molecular characterization, transcriptional regulation and intracellular localization of members of the PDI family
have been carried out on the typical PDIs of some cereal species (Chen and Hayes 1994; Shimoni et al. 1995b; Li and Larkins 1996). These studies indicated that the PDI enzyme may
play an important role in the folding of plant secretory proteins, and particularly in the
formation of endosperm protein bodies. Evidence supporting a role for PDI in
storage protein deposition in cereals also derives from the analysis of some maize and rice mutants producing seeds with altered endosperm protein bodies. PDI expression in the endosperm of a maize floury2 mutant was considered to be induced by a systemic stress signal due
to the production of a defective α-zein storage protein resulting in the abnormal association of
protein bodies (Li and Larkins 1996). Two more maize mutants, mucronate (mc) and defective endosperm B30 (de*-B30), exhibit an endosperm-specific increase of PDI as a result of
structural changes in storage proteins (Wrobel 1996; Kim et al. 2004). Recent investigations
on a natural rice mutant with irregular protein bodies found that the main storage proteins of
rice, that is prolamins and glutelins, which normally form discrete protein bodies containing
separately either protein class, failed to segregate correctly, forming new and smaller protein
bodies containing both prolamin and glutelin precursors bound by disulfide bonds (Takemoto
et al. 2002). The failure to form correct protein bodies was shown to be due to the absence of PDI expression, suggesting an essential and direct role of PDI in the segregation of
the two classes of polypeptides and formation of protein bodies.
4. THE WHEAT PDI
4.1. Intracellular localization and expression analyses
The presence of PDI in wheat endosperm was initially demonstrated by Roden et al.
(1982), who showed that PDI activity was associated with ER fractions isolated by ultracentrifugation of homogenates of developing endosperms. Similarly, Livesley et al. (1992)
showed that PDI was associated with microsomal (ER) fraction from embryos and aleurone
layers of dry mature and germinating grains of wheat. To study more accurately the role of
PDI in the maturation of plant proteins, Shimoni et al. (1995b) purified PDI from wheat endosperm and showed that wheat PDI appears as a 60-kD glycoprotein and is among the most
abundant proteins within the ER of developing grains. Subcellular localization analysis and
electron micrographs of immunogold labelling showed that PDI is not only present in the lumen of the ER, but it is also co-localized with the storage proteins in protein bodies. The
presence of PDI in the ER of wheat endosperm does not prove that it is necessary for the folding of storage proteins in vivo, although Bulleid and Freedman (1988a) showed that it was
able to catalyse the formation of intra-molecular disulfide bonds in a γ-gliadin synthesised in
vitro from a cloned cDNA. The newly synthesised polypeptide was transported into the microsomal fraction (ER) from dog pancreas. When transcription and translation were carried
out under conditions favouring disulfide bond formation, the protein had a faster electrophoretic mobility than when separated after reduction, indicating the presence of intra-chain disulfide bonds. This phenomenon was not observed when the microsomes were treated to remove PDI and other soluble lumenal proteins, but was restored by the addition of purified
PDI. Similar studies were carried out with genes encoding the high molecular weight glutenin
subunits 1Dy10 and 1Dy12 and a low molecular weight glutenin subunit (Bulleid and Freedman 1988b; 1992); in all cases the products synthesised under conditions favouring disulfide
bond formation migrated slightly faster than the fully reduced proteins. This higher mobility
was ascribed to the formation of intra-chain disulfide bonds. However, the system was not
able to precisely mimic the situation in the ER of wheat endosperm, since proteins failed to
form disulfide-stabilised oligomers. The authors offered different possible explanations, such as
low concentration of proteins, short time course or use of a heterologous (dog pancreas) system.
Analysis of PDI expression in durum wheat showed that its mRNA was constitutively
present in several tissues, but it was expressed at a very low level in coleoptiles, roots, leaves
and florets, and at a very high level in developing caryopses, where the transcript content was
very high during the early stages of seed development (9 to 17 days after anthesis) (Ciaffi et
al. 2001). This finding was in agreement with the results obtained in common wheat by Shimoni et al. (1995b) and Grimwade et al. (1996), who had shown, at protein and mRNA levels
respectively, that the temporal expression of PDI was not tightly co-ordinated with the expression of storage proteins, starting earlier in grain development and reaching a maximum
before the period of highest gluten synthesis. Although the available data do not show
unequivocally that PDI is essential for the folding and deposition of wheat storage proteins, they
indicate its involvement at some early stage of protein processing and protein body formation, or that it may have a more general, housekeeping role in the processing of secretory
proteins.
4.2. Characterization of PDI genes and of their promoters
Genes coding for typical PDI in wheat were located on the chromosomes of homoeologous group
4 by Ciaffi et al. (1999). Using a probe consisting of most of the PDI coding sequence,
cloned by PCR amplification with primers designed on the basis of the published cDNA sequence (Shimoni et al. 1995a), they detected by Southern analysis of DNA of Triticum aestivum cv. Chinese Spring (CS) four fragments of different length, which were assigned,
in order of decreasing length, to chromosome arms 4AL, 4BS, 4DS and 1BS using CS ditelosomic lines.
However, recent findings indicate that the PDI gene on chromosome 1B may correspond to a
pseudogene missing part of the 3’ coding sequence (Johnson and Bhave 2004). The location of a
PDI gene sequence on the long rather than the short arm of chromosome 4A was considered consistent with the pericentric inversion in this chromosome (Devos et al. 1995). The
number of PDI gene sequences is in line with results obtained in other plant species, such as
alfalfa (Shorrosh and Dixon 1991), maize (Li and Larkins 1996) and castor bean (Coughlan et
al. 1996), which possess a single-copy sequence, whereas two independent PDI loci were detected in barley (Chen and Hayes 1994). More recently, two and three PDI gene sequences
have been identified in the Arabidopsis and rice genomes, respectively (Houston et al. 2005).
Assessment of Restriction Fragment Length Polymorphisms (RFLPs) in different accessions and lines of hexaploid and tetraploid cultivated species of Triticum showed that the
restriction fragments located on chromosomes of the homoeologous group 4 were highly conserved and that polymorphism occurred only at the 1B locus. Similar analyses performed on
23 species of Triticum and Aegilops (Ciaffi et al. 2000) indicated that PDI restriction fragments were highly conserved within each species and confirmed that plant PDI is encoded by a single-copy sequence in diploid species and by a few copies in polyploid species. The Aegilops species of the Sitopsis section showed a rather complex pattern and a high
level of intraspecific variation, with the exception of Ae. searsii, which possessed a single,
conserved PDI fragment. T. urartu and Ae. tauschii showed single fragments with the same
mobility as those located respectively in the A and D genomes of polyploid species, whereas
differences were observed between the hybridization patterns of T. monococcum and T. boeoticum and that of the A genome. The hybridization pattern of T. zhukovskyi was identical to
that of T. timopheevi, except for the presence of an additional strong hybridization fragment
having the same mobility as the one detected in T. boeoticum and T. monococcum.
The nucleotide sequences of the three genes located on genomes A, B, and D (designated GPDI-4A, GPDI-4B and GPDI-4D) were 3561, 3527 and 3466 bp long, respectively (Ciaffi et al. 2005). Their alignment and comparison with the corresponding cDNA sequences (indicated as CPDI-4A, CPDI-4B and CPDI-4D) showed that they possess a conserved complex structure consisting of ten exons (Fig. 2). The first 5’ exon included a non-translated region of 32 bp and a translated region of 200 bp, with a 75 bp initial segment encoding a putative signal peptide of 25 amino acids. Codons for disulfide isomerase catalytic
sites (CGHC) were located in the second exon, starting from the first nucleotide, and in the
ninth exon, starting from the 26th nucleotide. Codons for a potential N-glycosylation site
(NSF) were in the sixth exon starting from the 15th nucleotide. The tenth exon included a
non-translated region (171 bp for GPDI-4B and 168 bp for GPDI-4A and GPDI-4D) and a
translated region (216 bp for GPDI-4B and 225 bp for GPDI-4A and GPDI-4D). Moreover,
the three genes showed a consensus sequence for the tetrapeptide KDEL for protein retention
within the ER, at the 3’ translated end (Ciaffi et al. 2005).
The Open Reading Frames (ORFs) of the PDI genes located in chromosomes 4A and
4D consisted of 1545 bp, corresponding to polypeptides of 515 amino acids, with an estimated molecular weight of 56.6 kDa, whereas that on chromosome 4B was shorter, with a length
of 1536 bp, corresponding to 512 amino acids, and an estimated molecular weight of 56.3
kDa. The three deduced protein sequences were rich in acidic residues and had a pI of 4.7. The
nucleotide sequences of the three ORFs showed an identity ranging between 96.0 and 97.5%.
Only 14 of the 66 nucleotide substitutions in the coding regions caused amino acid changes,
with six of them able to modify the physico-chemical features of the mature protein (Ciaffi et
al. 2005).
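The figures quoted above can be checked with simple arithmetic. The sketch below is a worked example, not the calculation used by the authors: it assumes that the reported ORF lengths exclude the stop codon, and it uses a rule-of-thumb average residue mass of 110 Da, whereas the published molecular weights were presumably derived from the actual amino acid composition. The names residues and approx_mass_kda are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0

# ORF lengths (bp) reported in the text for the three homoeologous genes.
ORF_BP = {"GPDI-4A": 1545, "GPDI-4B": 1536, "GPDI-4D": 1545}


def residues(orf_bp: int) -> int:
    """Number of encoded amino acids, assuming the length excludes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


def approx_mass_kda(n_residues: int) -> float:
    """Rough molecular mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


for gene, bp in ORF_BP.items():
    n = residues(bp)
    print(gene, n, "aa,", f"about {approx_mass_kda(n):.2f} kDa")

# The 4B ORF is shorter than the 4A and 4D ORFs by nine nucleotides,
# which is the same as the difference between the translated regions of
# the tenth exon quoted earlier (225 bp versus 216 bp).
orf_difference_bp = ORF_BP["GPDI-4A"] - ORF_BP["GPDI-4B"]
exon10_difference_bp = 225 - 216

# Share of the 66 coding substitutions that change an amino acid.
nonsynonymous_fraction = 14 / 66
print(orf_difference_bp, exon10_difference_bp, f"{nonsynonymous_fraction:.1%}")
```

Under these assumptions the residue counts agree with the reported 515 and 512 amino acids, and the rough mass estimates fall close to the reported 56.6 and 56.3 kDa.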
GPDI-4B and GPDI-4D showed high identity (94.5%), whereas GPDI-4A had the
same identity (92.0%) with both GPDI-4B and GPDI-4D. Identity between introns was 89-91% and that between exons 94-100%. Exons showed single nucleotide substitutions only,
apart from a single deletion of nine nucleotides within the tenth exon of GPDI-4B, whereas the introns had more frequent base substitutions and insertions/deletions, which account for the different
lengths of the genomic sequences (Fig. 2). The genomic sequence on chromosome 4D (GPDI-4D) of CS showed a very high identity (99.7%) with that of Aegilops tauschii (Johnson and
Bhave 2004). The most noteworthy differences were a 34 bp deletion at the end of the first intron and a 2 bp deletion in the eighth intron of GPDI-4D. The A genome PDI gene sequences of hexaploid and tetraploid species were highly conserved, showing 99.6% identity
(Ciaffi et al. 2001). Comparisons of these genomic sequences with those of Arabidopsis and
rice showed a significant conservation of the exon/intron structure and exon size across the
three species, most probably due to a strong relationship between the domain organization of
the encoded proteins and the genomic structure of the corresponding genes (Ciaffi et al.
2005).
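Identity values such as those reported above are computed from pairwise alignments. The following sketch shows one common convention, in which columns that are gaps in both sequences are ignored and a gap opposite a base counts as a mismatch; how gaps were treated in the cited analyses is not stated, so this convention is an assumption. The function name percent_identity and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumed convention: columns with a gap in both rows are skipped;
    a gap opposite a residue is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment with a two-base deletion and one substitution.
row_a = "ATGGCT--AAGCTT"
row_b = "ATGGCTGGAAGCTA"
print(f"{percent_identity(row_a, row_b):.1f}%")
```

With this convention, insertions and deletions in the introns lower the identity more than isolated substitutions in the exons, which is consistent with the lower intron values quoted above.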
Although the deduced amino acid sequences of the three wheat PDI genes exhibited an
overall identity of only 31% to that of human PDI, their modular architectures in terms of
number, size, location and secondary structure-propensities of the constituent domains are remarkably similar. Sequence homologies, both internally and to the human PDI, indicated that
proteins encoded by the three genes are composed of four major regions, corresponding almost
exactly to the a, b, b’ and a’ domains of the human PDI (Fig. 1). Secondary structure analysis
revealed that the a and a’ domains of wheat PDIs, which are homologous both to each other
(43% identity) and to thioredoxin, adopted a thioredoxin-like fold. The putative b and b’ domains of the wheat proteins also had a folding pattern very similar to that of the a
and a’ modules, although the extent of sequence identity between the b and b’ regions was not
adequate to consider them as internal repeats; moreover, no sequence homology was detected
between them and any thioredoxin or thioredoxin-like domains. Although specific structural
studies would be necessary to recognize the domain boundaries and to define their structure
unambiguously, the proposed multidomain structure of wheat PDI would suggest that in
plants, as in mammals, the PDI domains may stem from partial gene duplication of a common thioredoxin ancestral gene, followed by sequence divergence. Those domains probably
arose before the appearance of most eukaryotic species, because homologous PDI sequences
are present in eukaryotes as diverse as fungi, insects, mammals and plants (Freedman et al.
1994; Sahrawy et al. 1996; Ferrari and Soling 1999). All eukaryotic PDIs have recognisable a
and a’ modules, whereas the putative b and b’ modules have diverged to such an extent within
and between species that their homology is doubtful; however, the corresponding segments
always contain approximately the same number of amino acids and retain some elements of
the thioredoxin fold.
As far as the promoter sequences of the three homoeologous genes are concerned,
Ciaffi et al. (2005) cloned the region upstream of the translation start codon of each of
them. Their length was 1352 bp for PromPDI-4A, 1370 bp for PromPDI-4B and 1292 bp for
PromPDI-4D. The three sequences showed 89% overall identity; conservation was high in the
proximal 700 nt, where identity exceeded 93%, and lower in the distal region, with
about 80% identity. Differences were due to both nucleotide substitutions and short insertions/deletions. Each promoter showed a TATA box located at –79 nt from the start codon
and several CAAT boxes. They contained a number of different conserved cis-acting regulatory elements (Table 1), including several motifs (AACA, GCN4, prolamin box, RY element, Skn-1)
involved in the regulation of endosperm specific genes (Guilfoyle 1997; Albani et al. 1997;
Wu et al. 1998).
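A search for such motifs in a promoter sequence can be sketched as a plain string scan. The consensus strings below are commonly cited forms used here as placeholders and are assumptions, since the exact definitions behind Table 1 are not given in the text; the function scan_promoter and the toy promoter are likewise hypothetical, and only the forward strand is scanned. Positions are reported relative to the start codon, with the last promoter base at -1.

```python
import re
from typing import Dict, List

# Assumed consensus sequences (placeholders, not taken from Table 1).
MOTIFS = {
    "prolamin box": "TGTAAAG",
    "GCN4 motif": "TGAGTCA",
    "RY element": "CATGCATG",
    "AACA motif": "AACAAAC",
    "Skn-1 motif": "GTCAT",
    "CAAT box": "CAAT",
}


def scan_promoter(promoter: str,
                  motifs: Dict[str, str] = MOTIFS) -> Dict[str, List[int]]:
    """Map each motif name to its start positions relative to the start codon.

    The promoter is assumed to end immediately upstream of the ATG, so the
    last base is at position -1. Only the forward strand is searched.
    """
    seq = promoter.upper()
    n = len(seq)
    hits: Dict[str, List[int]] = {}
    for name, consensus in motifs.items():
        pattern = "(?=" + re.escape(consensus) + ")"
        hits[name] = [m.start() - n for m in re.finditer(pattern, seq)]
    return hits


# Hypothetical toy promoter, not a wheat sequence.
toy_promoter = "GGCAATTTGTAAAGCCTGAGTCATCC"
for name, positions in scan_promoter(toy_promoter).items():
    print(name, positions)
```

A scan of this kind only lists candidate elements; whether a motif is functional has to be established experimentally, for example by the promoter deletion analyses suggested in the conclusions.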
Expression analysis of the three CS PDI genes, performed by RT-PCR on mRNAs extracted from roots, coleoptiles, spikelets, leaves and developing caryopses collected at short
intervals between 6 and 34 days after anthesis (DAA), indicated that the three genes were
constitutively expressed in all the tissues tested, with very strong expression in immature caryopses, where transcription levels were quite similar (Fig. 3). The transcription levels of the
three genes were higher in the early stage of seed development (6-14 DAA) and decreased
during the middle to late stages of grain filling (18-34 DAA). The lowest level of transcripts was
observed for all three genes in coleoptiles, whereas differences in their expression were
detected in spikelets, roots and leaves (Fig. 4). CPDI-4A transcription was higher in spikelets,
that of CPDI-4B was higher in roots, and the CPDI-4D transcripts were more abundant in leaves.
It is noteworthy that no differences in the expression of the three genes were detected at different stages of caryopsis development (Fig. 3).
4.3. Identification of novel wheat PDI-like gene sequences
A search in the HarvEST Wheat database allowed Ciaffi et al. (in preparation) to identify several sequences containing ORFs with significant similarity to the coding regions of
genes assigned to five of the eight PDI phylogenetic groups identified in the Arabidopsis genome (Houston et al. 2005). Among them, 18 wheat ESTs corresponded to the full length PDI
sequences of Arabidopsis belonging to the fifth subfamily, and 15 ESTs to the second subfamily, both characterised by two thioredoxin-like active domains and structural similarities to different PDI-like proteins in higher eukaryotes, whereas two distinct groups of EST
sequences covered only part of the coding region of the fourth and seventh subfamilies. Finally, several ESTs showed significant homology with Arabidopsis sequences of the first phylogenetic group, whose members, as reported previously, have been cloned and extensively
characterized in wheat. Full length (II and V groups) and partial (IV and VII groups) cDNA
sequences were generated by RT-PCR from mRNAs of immature caryopses of CS and
cloned. Subsequently, the validity of the full length (WHEPDI-3 and WHEPDI-4) and partial
(WHEPDI-2 and WHEPDI-5) cDNA clones was checked and confirmed by sequence analysis
and comparison. Full-length cDNA sequences were obtained by 5’ and 3’ RACE extension
for the two incomplete sequences WHEPDI-2 and WHEPDI-5.
WHEPDI-2 exhibited only 19.4% identity with the typical wheat PDI of the first
group (WHEPDI-1); it was shorter, had fewer amino acids separating the pair of thioredoxin active domains and had a C-terminal α-helical domain of about 100 aa, termed the D domain,
whose function is unknown (Fig. 5). In spite of the presence of a potential ER-translocation
signal, this protein lacks an ER-retention signal, suggesting that it might be targeted to a different
subcellular location or could be retained as part of a heteromeric complex with other
subunits containing such a signal.
WHEPDI-3 exhibited only 19.6 and 18.8% identity with WHEPDI-1 and WHEPDI-2,
respectively. This protein includes 440 amino acid residues, the two thioredoxin active domains are separated by 32 amino acids, and the putative protein contains the C-terminal
KDEL signal for ER retention. The deduced aa sequence of WHEPDI-3, like those of the other plant
members of the fifth subfamily, is closely related to the mammalian P5 PDI-like proteins in
terms of sequence homology (about 40% identity), position of thioredoxin domains and size
of polypeptides.
The aa sequence of WHEPDI-4 showed 27.4, 18 and 14% identity with WHEPDI-1,
WHEPDI-2 and WHEPDI-3, respectively; it is the largest (585 aa) among the PDI-like proteins identified in wheat. WHEPDI-4 is characterised by the presence of a domain, located at
the N-terminus of the mature protein, that contains 40% of acidic residues (E+D). Remarkably, this domain is reminiscent of the c domain found close to the C-terminus of typical PDI
from mammals and to the N-terminus of homologs of ERP72 (Ferrari and Soling 1999). In
mammals this domain is a putative low-affinity, high-capacity calcium-binding site. The
deduced amino acid sequence of WHEPDI-4 is also characterized by the presence of a signal
peptide of 20 aa, of two thioredoxin active domains separated by 234 aa and of the C-terminal
signal KDEL for ER retention.
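Two of the features used above to describe WHEPDI-4, the proportion of acidic residues (E+D) in a domain and the presence of a C-terminal KDEL signal, are straightforward to compute from a deduced sequence. The sketch below is illustrative only; the function names acidic_fraction and has_kdel and the toy segments are hypothetical and do not reproduce the WHEPDI-4 sequence.

```python
def acidic_fraction(segment: str) -> float:
    """Fraction of aspartate (D) and glutamate (E) residues in a segment."""
    seg = segment.upper()
    if not seg:
        raise ValueError("empty segment")
    return (seg.count("D") + seg.count("E")) / len(seg)


def has_kdel(protein: str, signal: str = "KDEL") -> bool:
    """True if the protein ends with the C-terminal ER retention tetrapeptide."""
    return protein.upper().endswith(signal)


# Hypothetical toy segments, not taken from any wheat protein.
acidic_domain = "EEDDAKEDLPEEDSAEDDGK"
toy_protein = acidic_domain + "AWCGHCKQLAPKDEL"

print(f"{acidic_fraction(acidic_domain):.0%} acidic residues")
print("KDEL present:", has_kdel(toy_protein))
```

The exact KDEL tetrapeptide is matched here; KDEL-like variants, which the text mentions for some plant PDI-like proteins, would need a broader pattern.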
Finally, the deduced aa sequence of WHEPDI-5 is characterised by the presence of a single
thioredoxin active domain and a transmembrane segment, and by the absence of the KDEL
signal. Recently, proteins with similar structure have been identified in humans, Drosophila and
C. elegans (Clissold and Bicknell 2003; Wilkinson and Gilbert 2004; Ellgaard and Ruddock
2005).
5. CONCLUSIONS
The multigene family comprising PDI and PDI-like proteins performs manifold
metabolic functions. Their role has been shown by many studies, mostly in mammals,
whereas in plants knowledge of the structural and functional features of this group of proteins
and of their encoding genes is much less extensive. Given their possible involvement in determining
the technological properties of wheat flour, the study of the gene family encoding these proteins in wheat is important from an applied viewpoint, and it is also important for
understanding the molecular evolution of this multigene family in a polyploid context. Up to
now all studies on molecular characterization and transcriptional regulation have focused exclusively
on the typical wheat PDI; the limited information available shows that in hexaploid wheat
it is encoded by three homoeologous genes located on the group 4 chromosomes, that the coding
nucleotide sequences, exon/intron structures and regulatory sequences of these genes are well
conserved, and that all three genes are functional. The high evolutionary conservation probably reflects the important functional role of their product in protein folding. The negligible differences between
their cDNAs at the level of both nucleotide and deduced amino acid sequences do not support
a functional differentiation of their gene products, as only six of the 14 replaced amino acids
may modify the physico-chemical features of the mature proteins. The observation that the
expression of the three homoeologous genes is similar and much higher in immature caryopses than in other wheat tissues is consistent with the assumption that the quality control system involving PDI is up-regulated in tissues with abundant synthesis of secretory proteins, such as the wheat endosperm. The higher amount of PDI transcripts of the
three homoeologous genes detected in developing endosperm is consistent with the presence in
their promoter regions of several conserved motifs that have been shown to be involved
in the regulation of genes preferentially expressed in the endosperm. Differences observed in
the level of transcripts of the three genes in spikelets, roots and leaves would suggest a differential
regulation of transcription rates of the three wheat PDI genes. Future studies should focus on the functional analysis of the promoter regions of the three PDI genes to elucidate the
mechanism controlling their spatially and temporally specific expression and the role of the individual
regulatory motifs. In particular, reporter-gene expression analysis of progressive deletions
at their distal ends and/or base substitutions within putative consensus sequences would be
helpful for identifying the cis-elements that contribute to the higher transcriptional levels observed in seeds and the differential expression in other tissues.
Despite the recent data on the complexity and diversity of the PDI gene family in
plants, there are still numerous unanswered questions concerning the cellular location and physiological functions of the individual PDI and PDI-like proteins. For each of them it will be necessary to determine whether they have overlapping and redundant or separate and specific target
substrates. Determining the enzymatic specificity of the plant PDI-like proteins and their capacity to act independently or by interacting with other proteins in a redox chain would be
important for understanding their role in the formation and/or isomerization of disulfide bridges,
as well as their physiological functions. In wheat most members of the PDI family
have yet to be identified, since only five PDI-like genes have been isolated, as described previously. Further research will be needed to elucidate the complexity and diversity of the
PDI-like proteins in wheat and to understand their involvement and role in the folding, assembly, transport and deposition of seed storage proteins. A possible approach for exploring
these aspects would be to modify the expression levels of the PDI and PDI-like genes in
transgenic wheat plants and to examine the resulting phenotype, with particular attention to
the processing and storage of proteins that pass through the secretory system and to their
direct and/or indirect relevance to technological quality.
ACKNOWLEDGMENTS
This research was supported by the MIUR (Italian Ministry of Education, University
and Research), “FIRB” project (D.M. 199, 8/3/2001, Prot. RBNE01TYZF).
This paper is dedicated to G. T. Scarascia Mugnozza on the occasion of his 80th birthday.
6. REFERENCES
Albani D, Hammond-Kosack MCU, Smith C, Conlan S, Colot V, Holdsworth M and Bevan
MW (1997) The wheat transcriptional activator SPA: a seed-specific bZIP protein that
recognizes the GCN4-like motif in the bifactorial endosperm box of prolamin genes.
Plant Cell 9: 171-184
Bennet CF, Balcarek JM, Varricchio A and Crooke ST (1988) Molecular cloning and complete amino-acid sequence of form-I phosphoinositide-specific phospholipase C. Nature
334: 268-270.
Bulleid NJ and Freedman RB (1988a) Defective co-translational formation of disulphide
bonds in protein disulphide-isomerase-deficient microsomes. Nature 335: 649-651.
Bulleid NJ and Freedman RB (1988b) The transcription and translation in vitro of individual
cereal storage-protein genes from wheat (Triticum aestivum cv. Chinese Spring). Biochemical Journal 254: 805-810.
Bulleid NJ, Shewry PR and Freedman RB (1992) Exploring the structure and assembly of
wheat storage proteins using an in vitro transcription/translation system. In “Plant Protein
Engineering” (PR Shewry and S Gutteridge, eds) Cambridge University Press, Cambridge, UK pp 201-208.
Chen F and Hayes PM (1994) Nucleotide sequence and developmental expression of duplicated genes encoding protein disulfide isomerase in barley (Hordeum vulgare L.). Plant
Physiol 106: 1705-1706.
Cheng SY, Gong QH, Parkinson C, Robinson EA, Appella E, Merlino GT and Pastan I (1987)
The nucleotide sequences of a human cellular thyroid hormone binding protein present
in endoplasmic reticulum. J Biol Chem 262: 11221-11227.
Ciaffi M, Dominici L, Tanzarella OA and Porceddu E (1999) Chromosomal assignment of
gene sequences coding for protein disulphide isomerase (PDI) in wheat. Theor Appl
Genet 98: 405-410.
Ciaffi M, Dominici L, Umana E, Tanzarella OA and Porceddu E (2000) Restriction Fragment
Length Polymorphism (RFLP) for protein disulfide isomerase (PDI) gene sequences in
Triticum and Aegilops species. Theor Appl Genet 101: 220-226.
Ciaffi M, Paolacci AR, Dominici L, Tanzarella OA and Porceddu E (2001) Molecular characterization of gene sequences coding for protein disulfide isomerase (PDI) in durum wheat
(Triticum turgidum ssp. durum). Gene 265: 147-156.
Ciaffi M, Paolacci AR, d’Aloisio E, Tanzarella OA and Porceddu E (2005) Cloning and characterization of wheat PDI (Protein disulfide isomerase) homoeologous genes and promoter sequences. Gene (Accepted).
Clissold PM and Bicknell R (2003) The thioredoxin-like fold: hidden domains in protein disulfide isomerases and other chaperone proteins. BioEssays 25: 603-611.
Coughlan SJ, Hastings C and Winfrey RJ (1996) Molecular characterization of plant endoplasmic reticulum: Identification of protein disulfide-isomerase as the major reticuloplasmin. Eur J Biochem 235: 215-224.
Deleage G, Blanchet C and Geourjon C (1997) Protein structure prediction. Implication for
biologist. Biochimie 79(11): 681-686.
Denecke J, De Rycke R and Botterman J (1992) Plant and mammalian sorting signals for protein retention in the endoplasmic reticulum contain a conserved epitope. EMBO J 11:
2345-2355.
Denecke J (1996) Soluble endoplasmic reticulum resident proteins and their function in protein synthesis and transport. Plant Physiol Biochem 34: 197-205.
Devos KM, Dubcovsky J, Dvorak J, Chinoy CN and Gale MD (1995) Structural evolution of
wheat chromosomes 4A, 5A and 7B and its impact on recombination. Theor Appl Genet
91: 282-288.
Ellgaard L and Ruddock LW (2005) The human protein disulphide isomerase family: substrate
interactions and functional properties. EMBO reports 6: 28-32.
Fassio A and Sitia R (2002) Formation, isomerization and reduction of disulphide bonds during protein quality control in the endoplasmic reticulum. Histochem Cell Biol 117: 151-157.
Ferrari DM and Soling HD (1999) The protein disulphide-isomerase family: unravelling a
string of folds. Biochem J 339: 1-10.
Frand AR, Cuozzo JW and Kaiser CA (2000) Pathways for protein disulphide bond formation. Trends Cell Biol 10(5): 203-210.
Freedman RB, Hirst TR and Tuite MF (1994) Protein disulphide isomerase: building bridges
in protein folding. Trends Biochem Sci 19: 331-336.
Grimwade B, Tatham AS, Freedman RB, Shewry PR and Napier JA (1996) Comparison of
the expression patterns of wheat gluten proteins and proteins involved in the secretory
pathway in developing caryopses of wheat. Plant Mol Biol 30(5): 1067-1073.
Guilfoyle TJ (1997) The structure of plant gene promoters. In: Setlow JK (ed) Genetic engineering vol 19, Plenum Press, New York, pp15-47.
Hayano T, Hirose M and Kikuchi M (1995) Protein disulfide isomerase lacking its isomerase
activity accelerates folding in the cell. FEBS Letters 377(3): 505-511.
Houston NL, Fan C, Xiang QY, Schulze JM, Jung R and Boston RS (2005) Phylogenetic
analyses identify 10 classes of the protein disulfide isomerase family in plants, including
single-domain protein disulfide isomerase-related proteins. Plant Physiol 137: 762-778.
Johnson JC and Bhave M (2004) Molecular characterisation of the protein disulphide isomerase genes of wheat. Plant Sci 167: 397-410.
Kemmink J, Darby NJ, Dijkstra K, Nilges M and Creighton TE (1997). The folding catalyst
protein disulfide isomerase is constructed of active and inactive thioredoxin modules.
Curr Biol 7(4): 239-245.
Kim CS, Hunter BG, Kraft J, Boston RS, Yans S, Jung R and Larkins BA (2004) A defective
signal peptide in a 19-kD alpha-zein protein causes the unfolded protein response and an
opaque endosperm phenotype in the maize De*-B30 mutant. Plant Physiol 134: 380-387.
Li CP and Larkins BA (1996) Expression of protein disulfide isomerase is elevated in the endosperm of the maize floury-2 mutant. Plant Mol Biol 30: 873-882.
Livesley MA, Bulleid NJ and Bray CM (1992) Protein disulfide isomerase in germinating
wheat (Triticum aestivum) seed and loss of viability. Seed Sci Res 2: 97-103.
Lucero HA and Kaminer B (1999) The role of calcium on the activity of ER calcistorin/Protein disulfide isomerase and the significance of the C-terminal and its calcium
binding. A comparison with mammalian protein-disulfide isomerase. J Biol Chem
274(5): 3243-3251.
Lyles MM and Gilbert HF (1991) Catalysis of the oxidative folding of ribonuclease A by protein disulfide isomerase: dependence of the rate on the composition of the redox buffer.
Biochemistry 30(3): 613-619.
Noiva R and Lennarz WJ (1992) Protein disulfide isomerase. A multifunctional protein resident in the lumen of the endoplasmic reticulum. J Biol Chem 267(6): 3553-3556.
Pihlajaniemi T, Helaakoski T, Tasanen K, Myllyla R, Huhtala ML, Koivu JG and Kivirikko
KI (1987) Molecular cloning of the beta-subunit of human prolyl 4-hydroxylase. This
subunit and protein disulphide isomerase are products of the same gene. EMBO J 6:
643-649.
Roden LT, Miflin BJ and Freedman RB (1982) Protein disulphide isomerase is located in the
endoplasmic reticulum of developing wheat endosperm. FEBS Lett 138: 121-124.
Rubin R, Levanoy H and Galili G (1992) Evidence for the presence of two different types of
protein bodies in wheat endosperm. Plant Physiol 99: 718-724.
Sahrawy M, Hecht V, Lopez-Jaramillo J, Chueca A, Chartier Y and Meyer Y (1996). Intron
position as an evolutionary marker of thioredoxins and thioredoxin domains. J Mol Evol
42: 422-431.
Schwaller M, Wilkinson B and Gilbert HF (2003) Reduction-reoxidation cycles contribute to
catalysis of disulfide isomerisation by protein-disulfide isomerase. J Biol Chem 278(9):
7154-7159.
Shewry PR and Tatham AS (1997) Disulphide bonds in wheat gluten proteins. J Cereal Sci
25: 207-227.
Shewry PR, Halford NG, Tatham AS, Popineau Y, Lafiandra D and Belton PS (2003) The
high molecular weight subunits of wheat glutenin and their role in determining wheat
processing properties. Adv Food Nutr Res 45: 219-302.
Shimoni Y, Segal G, Zhu X and Galili G (1995a) Nucleotide sequence of a wheat cDNA encoding protein disulfide isomerase. Plant Physiol 107: 281.
Shimoni Y, Zhu X, Levanoy H, Segal G and Galili G (1995b) Purification, characterization,
and intracellular localization of glycosylated protein disulfide isomerase from wheat
grains. Plant Physiol 108: 327-335.
Shorrosh BS and Dixon RA (1991) Molecular cloning of a putative plant endomembrane protein resembling vertebrate protein disulfide-isomerase and a phosphatidylinositol-specific phospholipase. Proc Natl Acad Sci USA 88: 10941-10945.
Shorrosh BS and Dixon RA (1992) Molecular characterization and expression of an alfalfa
protein with sequence similarity to mammalian ERp72, a glucose-regulated endoplasmic
reticulum protein containing active site sequences of protein disulphide isomerase. Plant
J 2: 51-58.
Takemoto Y, Coughlan SJ, Okita TW, Hikaru S, Masahiro O and Tohihiro K (2002) The rice
mutant esp2 greatly accumulates the glutelin precursor and deletes the protein disulfide
isomerase. Plant Physiol 128: 1212-1222.
Tanaka S, Uehara T and Nomura Y (2000) Up-regulation of protein-disulfide-isomerase in response to hypoxia/brain ischemia and its protective effect against apoptotic cell death. J Biol
Chem 275: 10388-10393.
Tasanen K, Parkkonen T, Chow LT, Kivirikko KI and Pihlajaniemi T (1988) Characterization
of the human gene for a polypeptide that acts both as the beta subunit of prolyl 4-hydroxylase and as protein disulfide isomerase. J Biol Chem 263: 16218-16224.
Tsai B, Rodighiero C, Lencer WI and Rapoport TA (2001) Protein disulfide isomerase acts as
a redox-dependent chaperone to unfold cholera toxin. Cell 104: 937-948.
Tu BP and Weissman JS (2004) Oxidative protein folding in eukaryotes: mechanisms and
consequences. J Cell Biol 164(3): 341-346.
Turano C, Coppari S, Altieri F and Ferraro A (2002) Proteins of the PDI family: unpredicted
non-ER locations and functions. J Cell Physiol 193: 154-163.
Wetterau JR, Combs KA, Spinner SN and Joiner BG (1990) Protein disulfide isomerase is a
component of the microsomal triglyceride transfer protein complex. J Biol Chem
265(17): 9801-9807.
Wilkinson B and Gilbert HF (2004) Protein disulfide isomerase. Biochim Biophys Acta 1699:
35-44.
Wrobel R (1996) Expression of molecular chaperones in endoplasmic reticulum of maize endosperm. PhD thesis. North Carolina State University, Raleigh, NC.
Wu CY, Suzuki A, Washida H and Takaiwa F (1998) The GCN4 motif in a rice glutelin gene is
essential for endosperm-specific gene expression and is activated by Opaque-2 in transgenic rice plants. Plant J 14: 673-683.
Xu ZJ, Ueda K, Masuda K, Ono M and Inoue M (2002) Molecular characterization of a novel
protein disulfide isomerase in carrot. Gene 284: 225-231.
Yao Y, Zhou YC and Wang CC (1997) Both the isomerase and chaperone activities of protein
disulfide isomerase are required for the reactivation of reduced and denaturated acidic
phospholipase A2. EMBO J 16: 651-658.
Table 1. Main regulatory motifs found within the promoter sequences of the three wheat PDI
genes

Motif         Chromosome  Strand  Distance  Sequence(a)    Function
              location            from ATG
Prolamin box  4A          +       -226      TGAAAAGT       Cis-acting element associated with GCN4 in prolamin genes
              4B          +       -227      CGAAAAGT
              4D          +       -226      CGAAAAGT
              4A          -       -1117     TGAAAAGT
              4B          -       -1135     TGAAAAGT
              4D          -       -1105     TGAAAAGT
AACA          4A          +       -1111     CAAC-CAATTTCG  Cis-acting element conserved in rice glutelin genes and involved in endosperm specific expression
              4B          +       -1129     CAACAAACTTCG
              4D          +       -1099     CAAC-CATTTTCG
RY-element    4A          +       -815      CATGCATT       Cis-regulatory element involved in seed specific expression
              4B          +       -810      CATGCATT
              4D          +       -804      CATGCATT
GCN4          4A          +       -356      CGACTCA        Cis-acting element involved in seed specific expression of proteins in legumes and cereals
              4B          +       -358      CGAGTCA
              4D          +       -357      TGAGTCA
              4A          +       -492      CATGTCA
              4B          +       -493      CATGTCA
              4D          +       -491      CGTGTCA
              4A          -       -484      CGTGTCA
              4B          -       -485      CGTGTCA
              4D          -       -483      CATGTGA
              4A          -       -780      CGAGCCA
              4B          -       -775      CGAGCCA
              4D          -       -769      CGAGCCA
TATA box      4A          +       -79       TATTAAA
              4B          +       -79       TATTAAA
              4D          +       -79       TATTAAA
Skn-1         4A          -       -480      ACGAC          Cis-acting regulatory element for endosperm expression
              4B          -       -481      ACGAC
              4D          -       -479      ATGAC

Motif         Chromosome  Strand +  Strand -  Function
              location    (number)  (number)
CAAT box      4A          12        12        Cis-acting element common in promoter and enhancer regions
              4B          14        17
              4D          12        15
GC-motif      4A          6         11        Cis-acting element common in promoter and enhancer regions
              4B          7         11
              4D          10        10

(a) Consensus sequences of the motifs are in bold, base substitutions are underlined.
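
A motif search of the kind summarised in Table 1 can be illustrated with a minimal sketch. It is not the procedure used for the table: the function names, the toy promoter sequence and the coordinate convention (the base immediately 5' of the ATG is position -1, and a hit is reported at its leftmost base on the + strand) are assumptions made for the example. Only the two motif strings are taken from Table 1.

```python
# Illustrative sketch only: scan_promoter and reverse_complement are
# hypothetical helper names, and the toy promoter below is invented.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_promoter(promoter: str, motifs: dict[str, str]) -> list[tuple[str, str, int]]:
    """Find exact motif matches on both strands of a promoter.

    The promoter is the upstream sequence whose last base lies
    immediately 5' of the ATG (assumed position -1). Each hit is
    (motif name, strand, position of the leftmost matched base).
    """
    promoter = promoter.upper()
    n = len(promoter)
    hits = []
    for name, motif in motifs.items():
        # A '-' strand hit is found by searching the '+' strand
        # for the reverse complement of the motif.
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = promoter.find(pattern)
            while start != -1:
                hits.append((name, strand, start - n))
                start = promoter.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Motif strings as listed in Table 1; the promoter is a made-up example.
motifs = {"RY-element": "CATGCATT", "TATA box": "TATTAAA"}
toy_promoter = "GGCATGCATTACGTTTTAATACC"

for name, strand, position in scan_promoter(toy_promoter, motifs):
    print(name, strand, position)
```

Exact string matching is used here for simplicity; motifs carrying base substitutions, such as several of those in Table 1, would need a mismatch-tolerant or pattern-based search instead.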
Fig. 1. Comparison of the domain structure of human and wheat typical PDIs. The elements of
secondary structure, either determined for the a and b domains of the human PDI using
NMR techniques (PDB Id 1BJX) or predicted for the putative a, a’, b and b’ domains of
wheat CPDI-4A by the procedure of Deleage et al. (1997), are reported. Open boxes indicate residues present in α helices, whereas those delimited by black solid boxes indicate residues present in β strands.
Fig. 2. Intron-exon structures of the three group 4 homoeologous PDI genes. The open boxes
indicate exons and the solid black boxes denote introns, numbers represent exon and intron size (bp). The positions of the putative N-terminal signal peptide (SP), of the two
thioredoxin-like active sites (CGHC) and of the C-terminal KDEL signal sequence for
ER retention are also indicated.
Fig. 3. RT-PCR of the three PDI genes in developing caryopses collected between 6 and 34
DAA (days after anthesis). RT-PCR products were taken after 20 and 25 cycles of amplification and analysed in 1.2% agarose gels. The tubulin (TUB) constitutive gene
was amplified as control. M: part of the DNA molecular weight marker XIV (Roche),
the most intense band is 500 bp in length.
Fig. 4. Expression analysis by RT-PCR of the three PDI genes in different wheat tissues (1:
roots; 2: seedlings; 3: spikelets; 4: leaves; 5: developing caryopses 10 DAA). a) Agarose
gel electrophoresis of PCR products after 20 and 25 cycles of amplification. The tubulin constitutive gene (TUB) was amplified as control. b) Southern blots of PCR
products after 18 and 23 cycles hybridised with probes represented by the cDNA sequences of the three homoeologous genes (CPDI-4A, CPDI-4B and CPDI-4D).
Fig. 5. Domain structure of the deduced amino acid sequences of wheat PDI-like genes. The
position and length of the redox-active thioredoxin domains, of the acidic domain, of
the D domain, of the transmembrane segment and of the putative signal peptide (SP) are
indicated. Bars indicate the C-terminal KDEL signal sequence and the thioredoxin-like
active sites (CGHC). The analysis of the predicted protein sequences of the wheat PDI-like genes was carried out by searching for conserved motifs in the Pfam HMMs, InterPro and SMART databases.
[Fig. 1: schematic of the domain structure of human and wheat typical PDIs, showing the signal peptide (SP), the a, b, b’ and a’ domains, the c region, the CGHC active sites, the KDEL signal, α-helices and β-strands.]
[Fig. 2: intron-exon diagrams of the three homoeologous PDI genes, with exon and intron sizes (bp) and the positions of SP, CGHC and KDEL.]
[Fig. 3: RT-PCR gel images for PDI-4A, PDI-4B, PDI-4D and TUB at 6, 10, 14, 18, 22, 26, 30 and 34 DAA, after 20 and 25 cycles; M: marker.]
[Fig. 4: gel and Southern blot images; M: marker.]
[Fig. 5: domain diagrams of WHEPDI-1 to WHEPDI-5, showing SP, thioredoxin domains (a°, a, a’), CGHC active sites, the acidic (E+D rich) segment, the D domain, the transmembrane segment and the C-terminal KDEL or NDEL signals.]