A genome-wide survey of human thioredoxin and
glutaredoxin family pseudogenes.
Giannis Spyrou1, William Wilson2, Carmen Alicia Padilla3, Arne Holmgren4
and Antonio Miranda-Vizuete1,5.
1Department of Biosciences at Novum, Center for Biotechnology, Karolinska Institute,
S-141 57 Huddinge, Sweden, 2Department of Cell and Molecular Biology, Karolinska
Institute, S-17177 Stockholm, Sweden, 3Department of Biochemistry and Molecular
Biology, University of Córdoba, 14071 Córdoba, Spain and 4Department of Medical
Biochemistry and Biophysics, Karolinska Institute, S-17177 Stockholm, Sweden.
5Corresponding author: Tel.: 46-8-6083338; Fax: 46-8-7745538; Email: anmi@biosci.ki.se
Abbreviations: Grx, glutaredoxin; GR, glutathione reductase; GSH, reduced
glutathione; MY, million years; ORF, open reading frame; PAPS, 3´-phosphoadenosine
5´-phosphosulfate; Trx, thioredoxin; TrxR, thioredoxin reductase; UTR, untranslated
region
Keywords: Thioredoxin, Glutaredoxin, Pseudogene, Retrotransposition.
Running title: Human thioredoxin and glutaredoxin pseudogenes.
Human Trx1 pseudogenes 1 to 7 GenBank accession numbers: AF146023, AF146024,
AF357530, AF357531, AF357532, AF357533 and AF357534.
Human Grx1 pseudogenes 1 and 2 GenBank accession numbers: AF227512 and
AF358259.
SUMMARY
The thioredoxin/glutaredoxin family consists of small, heat stable proteins with a
highly conserved CXXC active site which participate in the regulation of many redox
reactions. We have searched the human genome sequence to find putative pseudogenes
(non-functional copies of protein coding genes) for all known members of this family.
This survey resulted in the identification of seven processed pseudogenes for human
Trx1 and two more for human Grx1. No evidence for the presence of processed
pseudogenes was found for the remaining members of this family. In addition, we were
unable to detect any non-processed pseudogenes derived from any member of the
family in the human genome. The seven thioredoxin pseudogenes can be divided into
two different groups: Trx1-2, -3, -4, -5 and -6 arose from the functional ancestor,
while Trx1-1 and -7 originated from Trx1-2 and -6, respectively. In all cases, the
pseudogenes originated after the human/rodent radiation, as shown by
phylogenetic analysis. This is also the case for Grx1-1 and Grx1-2 which are placed
between rodent and human sequences in the phylogenetic tree. Our study provides a
molecular record of the recent evolution of these two genes in the hominid lineage.
INTRODUCTION
Thioredoxins (Trx) and glutaredoxins (Grx) comprise a family of ubiquitous small
heat-stable proteins that function as general protein-disulfide reductases by the
reversible oxidation of their respective active sites, CGPC for thioredoxin and CPYC or
CSYC for glutaredoxin (Holmgren 1989; Lundberg et al. 2001). Both proteins catalyze
different redox reactions at the expense of the reducing power of NADPH yet by
different pathways as shown in scheme 1:
NADPH -> TrxR -> Trx -> Prot-S2 reduced to Prot-(SH)2
NADPH -> GR -> GSH -> Grx -> Prot-S2 reduced to Prot-(SH)2
Scheme 1
Thus, the thioredoxin system transfers the electrons from NADPH to the
flavoenzyme thioredoxin reductase (TrxR) that, in turn, keeps thioredoxin in its active
reduced form. In contrast, in the glutaredoxin system the electrons are sequentially
transferred to glutathione reductase (GR), glutathione (GSH) and glutaredoxin
(Holmgren 1989). This model, which had remained unaltered since the discovery of Trx
and Grx in E. coli, was unexpectedly revised by the discovery that in
Drosophila glutathione is maintained in its reduced form by a thioredoxin system, as the
fruit fly lacks the enzyme glutathione reductase (Kanzok et al. 2001).
Thioredoxin and glutaredoxin are present in all organisms investigated from
lower prokaryota to humans and are highly conserved through evolution. Many
different functions have been ascribed to thioredoxin and glutaredoxin since their
discovery as electron donors for the essential enzyme ribonucleotide reductase in E. coli
(Holmgren 1976; Laurent et al. 1964). Some other functions are also shared by these two
proteins, including electron donor for PAPS reductase, modulator of transcription
factor DNA binding activity or general antioxidant defense (Hirota et al. 2000;
Holmgren 1989; Holmgren 2000). However, there are roles that either Trx or Grx
specifically carry out. For instance, mammalian Trx1 is secreted by cells, is able to act as
co-cytokine or regulate apoptosis by direct interaction with ASK1 protein (Nakamura et
al. 1997; Saitoh et al. 1998) as well as specifically reducing methionine sulfoxide
reductase (Arner and Holmgren 2000). In bacteria, Trx1 is the only electron donor for
methionine sulfoxide reductase (similar to mammals) and participates in phage
assembly, whereas Trx2 has been shown to maintain the reducing environment of the E. coli
cytoplasm (Arner and Holmgren 2000; Stewart et al. 1998). Such functions are not
shared by any of the known glutaredoxins. On the other hand, only glutaredoxin is
known to display dehydroascorbic acid reductase activity, necessary for the neutrophil
oxidative burst during infection (Park and Levine 1996), and to regulate HIV-1 protease
in vitro (Davis et al. 1997).
In recent years we have been involved in the identification and
characterization of novel members of the thioredoxin/glutaredoxin family. This search
has revealed a complicated picture, as more than one thioredoxin or glutaredoxin
form can be found within the same organism. Thus, E. coli contains two thioredoxins and
three glutaredoxins (Åslund et al. 1994; Laurent et al. 1964; Miranda-Vizuete et al.
1997a); yeast has two thioredoxin systems, in cytosol and mitochondria respectively,
and two glutaredoxins (Grant 2001; Pedrajas et al. 1999). Photosynthetic organisms have
several thioredoxin systems in different cellular compartments, including chloroplasts,
which have been implicated in light energy transduction and the response to oxidative stress
(Juttner et al. 2000; Mestres-Ortega and Meyer 1999) and, more recently, in control of
the plant self-incompatibility response (Cabrillac et al. 2001). Finally, mammalian cells
have, at least, two thioredoxin and glutaredoxin systems located in the cytosol and
mitochondria respectively (Holmgren 1989; Lundberg et al. 2001; Miranda-Vizuete et al.
2000), and other thioredoxin-like proteins have been identified (Miranda-Vizuete et al.
1998 and unpublished results). In addition, we have recently characterized the first
member of the thioredoxin family with a tissue-specific distribution in human
spermatozoa (Miranda-Vizuete et al. 2001).
To complete the study of all sequences originating from the human
thioredoxin and glutaredoxin genes, we report here the outcome of a survey for putative
pseudogenes of all known members of this family in the human genome. This search
resulted in the identification of seven human Trx1 and two Grx1 pseudogenes, all of
them falling into the category of processed pseudogenes.
MATERIALS AND METHODS
Sequence identification and chromosomal localization. All known human
thioredoxin and glutaredoxin protein sequences were compared with the non-redundant
and high-throughput data sets of the human genome draft (in six-frame translation) at
NCBI using the TBLASTN search (http://www.ncbi.nlm.nih.gov/BLAST/). TBLASTN hits
showing at least 80% identity and e-values ≤ 0.001 along the whole protein sequence
were considered sequence matches and selected for further analysis. All the identified
sequences were searched against the UniSTS database
(http://www.ncbi.nlm.nih.gov/genome/sts/epcr.cgi), which integrates marker and
mapping data from public resources, to assign them their corresponding interval
chromosomal markers. With this information we mapped their chromosomal
localization using the physical maps at http://www.ncbi.nlm.nih.gov/genome/guide/human,
with the GenBank and ideogram options selected for display settings.
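The hit-selection rule described above (at least 80% identity and an e-value no greater than 0.001 along the whole protein) can be illustrated with a minimal sketch. It assumes the search results have already been parsed into simple records; the `Hit` class, its field names and the `min_coverage` parameter are hypothetical and are not part of the original analysis or of any BLAST tool.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One parsed search hit (hypothetical record layout)."""
    subject_id: str     # identifier of the genomic sequence
    identity: float     # percent identity over the aligned region
    evalue: float       # expectation value reported for the hit
    query_aligned: int  # number of query residues covered by the alignment
    query_len: int      # full length of the query protein


def passes_filter(hit, min_identity=80.0, max_evalue=1e-3, min_coverage=1.0):
    """Apply the identity and e-value thresholds stated in the text.

    min_coverage is an assumption standing in for "along the whole
    protein sequence"; it is the fraction of the query that must align.
    """
    coverage = hit.query_aligned / hit.query_len
    return (
        hit.identity >= min_identity
        and hit.evalue <= max_evalue
        and coverage >= min_coverage
    )


def select_hits(hits, **thresholds):
    """Return the hits that satisfy all thresholds, in their original order."""
    return [hit for hit in hits if passes_filter(hit, **thresholds)]
```

A looser `min_coverage` would be needed to retain truncated copies, so the value chosen here is only illustrative.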
Sequence alignment and phylogenetic analysis. The DNA sequences were
aligned using the Clustal W program (Thompson et al. 1994). The phylogenetic analysis
was performed by applying the Neighbor-Joining method to the
alignment data (Saitou and Nei 1987). Both the Clustal W alignment and the
Neighbor-Joining algorithm were run using the Megalign program included in the DNASTAR
Software Package (DNASTAR Inc., Madison, WI). Statistical support for nodes of the
Neighbor-Joining trees was assessed using the 50% majority-rule consensus trees
compiled from 1000 bootstrap replications (Felsenstein 1985), implemented with the
NJPlot program available at http://www.biom3.univ-lyon1.fr/pub/mol_phylogeny/njplot/.
To root the trees we used the human Sptrx
thioredoxin domain and the human mitochondrial Grx2 mature form (resulting after
cleavage of the mitochondrial targeting sequence), respectively (Lundberg et al. 2001;
Miranda-Vizuete et al. 2001). The degree of divergence (or evolutionary distance)
between each pseudogene and its functional homologue was estimated from the number
of transitions and transversions per site between the two sequences, using the
Jukes-Cantor correction for mutation saturation, d = -(3/4) ln(1 - (4/3)p), where p is the
observed proportion of substituted sites (Graur and Li 2000). The sequence of the processed
pseudogene used for these calculations was that corresponding to the ORF of the
functional gene. The time of the retrotransposition event was calculated assuming a
constant mutation rate of 1 x 10^-9 per site per year (Nachman and Crowell 2000).
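The distance and dating calculation can be sketched as follows. The Jukes-Cantor formula is the one given above; the conversion of distance to time as t = d / rate is an assumption made for this illustration, since the text does not state whether the distance was divided by the rate or by twice the rate. The function names are hypothetical.

```python
import math


def observed_proportion(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance d = -(3/4) ln(1 - (4/3) p)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def age_in_million_years(d, rate_per_site_per_year=1e-9):
    """Age estimate under the assumed relation t = d / rate (see lead-in)."""
    return d / rate_per_site_per_year / 1e6
```

Under this assumed relation, a corrected distance of 0.085 substitutions per site corresponds to about 85 MY; a two-lineage convention (t = d / 2r) would halve that figure.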
RESULTS AND DISCUSSION
The human genome draft has recently been published through the efforts of both
public and private initiatives. The first surprise came with the total
number of genes, as the initial estimate of 70,000 to 100,000 genes (Fields et al. 1994)
has dropped dramatically to 30,000-40,000 protein-coding genes, only about twice as
many as in C. elegans or Drosophila. However, alternative splicing of a vast majority of
these genes is expected to increase the number of functional protein products (Lander et
al. 2001; Venter et al. 2001). In humans, coding information comprises less than 5% of
the genome, whereas more than 50% consists of repetitive sequences, most of them
derived from transposable elements (Lander et al. 2001). The availability of the human
genome sequence provides an invaluable tool for discovering new genes. With the
identification of the complete set of human genes and proteins, we will radically
transform our understanding of all aspects of human biology and medicine. This
approach is nevertheless hampered by the presence within the genome of inactive
sequences, the so-called pseudogenes, with high identity to one or more paralogous
functional genes. Pseudogenes are a small but significant part of the genomic repetitive
sequences and are present in all groups of living organisms. Pseudogenes arise from a
functional gene by two different mechanisms: by retrotransposition of its mRNA,
generating a processed pseudogene that lacks introns, or by genomic DNA
duplication, resulting in a non-processed pseudogene that retains the genomic
organization of the ancestral gene (Mighell et al. 2000). In both cases, inactivation of
the new copy is considered to occur shortly after integration into the genome. Although
most pseudogenes are inactive because they cannot be transcribed, a growing
number of reported pseudogenes are transcribed (driven by nearby promoter-like
sequences), but no evidence of translated products has been found so far. The functional
relevance of these transcripts remains unclear (Brosius 1999).
To gain more insight into the thioredoxin/glutaredoxin field we have conducted a
survey in the human genome sequence to identify the putative pseudogenes for any
known member of the thioredoxin/glutaredoxin family. The discrimination between
functional and non-functional copies of different members of a protein family based
solely on sequence data must be undertaken carefully as it is possible that the
expression analysis might not have been performed in the appropriate cell, tissue or
developmental stage. We have adopted the following criteria to consider a given
genomic sequence a pseudogene:
a) High identity (higher than 80%, e-value ≤ 0.001) at the nucleotide level with the
paralogous gene.
b) Agreement between the sequences obtained from the human public (NCBI) and
private (Celera) databases.
c) Presence of coding disablement events such as in-frame premature stop codons,
deletions/insertions leading to frameshifts, removal of the starting methionine
or active site, and insertion of repetitive sequences.
d) No matches of the sequence with any entry in the expressed sequence tag (EST)
database.
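Criterion c) lends itself to a simple automated check. The sketch below is an illustration only, not the procedure used in this study: it assumes the candidate region has already been aligned to the functional ORF and extracted as a nucleotide string, and it covers only three of the listed events (lost start codon, length change suggesting a frameshift, and in-frame premature stop). Active-site loss and repeat insertions would need a translation table and a repeat library. The function name and feature labels are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def disablement_features(candidate, start_codon="ATG"):
    """List simple coding disablements found in a candidate ORF-like region.

    candidate: nucleotide string for the region matching the functional ORF;
    alignment gap characters ("-") are ignored.
    """
    seq = candidate.upper().replace("-", "")
    features = []

    if not seq.startswith(start_codon):
        features.append("start codon lost")

    in_frame = len(seq) % 3 == 0
    if not in_frame:
        features.append("length not a multiple of three (possible frameshift)")

    # Split into complete codons, reading from the first base.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # The terminal codon of an in-frame sequence is allowed to be a stop.
    internal = codons[:-1] if in_frame else codons
    if any(codon in STOP_CODONS for codon in internal):
        features.append("premature in-frame stop codon")

    return features
```

An empty list means that none of the three checks fired, which by itself does not show that a sequence is functional.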
With these premises we initiated our study using the protein sequence of each
known member of the human thioredoxin/glutaredoxin family to search for sequences
with homology at the DNA level in the human genome draft in both non-redundant
and high-throughput human databases. Using this method we identified several
homologous sequences. The first filtering step was to remove from this set all sequences
that coded for any of the known functional genes. The remaining sequences were
considered as either potential pseudogenes or potential novel functional members of
the protein family. We rapidly ruled out the possibility that any of the selected
sequences would encode a functional novel member of the thioredoxin/glutaredoxin
family as all of them presented clear disablement features and no matches with the EST
database were found. Two pseudogenes for thioredoxin-1 and one for
glutaredoxin-1 (Tonissen and Wells 1991; Miranda-Vizuete and Spyrou 2000;
Miranda-Vizuete and Spyrou 2001) have already been reported and were present among the
selected sequences, further supporting the search strategy. Therefore, we initially
hypothesized that all the sequences obtained by this approach were potential
pseudogenes.
Next, the selected sequences were aligned at the DNA level with the complete
mRNA of their respective functional genes. The alignment results indicated seven
potential pseudogenes for human Trx1 and two for Grx1 but none for the remaining
members of the family, Trx2 and Grx2 (both mitochondrial), Txl (cytosolic) and Sptrx
(sperm specific) (Lundberg et al. 2001; Miranda-Vizuete et al. 1998; Miranda-Vizuete et
al. 1997b; Miranda-Vizuete et al. 2001). It is important to note that all the sequences
matched their corresponding functional gene along its full mRNA sequence length
(except for three cases of Alu insertions, in Trx1-2, -6 and -7, whose removal from the
sequence also restored full mRNA homology, and one 5´-truncation, in Trx1-7),
strongly supporting the idea that they are processed pseudogenes. We confirmed
in the Celera database all disablement events and that no single EST entry matched
their sequences. Taken together, these results led us to conclude that the nine sequences
identified are processed pseudogenes.
Table 1 and Figure 1 summarize the features of the seven Trx1 pseudogenes
identified. Trx1-1 and Trx1-2 have been previously described (Miranda-Vizuete and
Spyrou 2000; Tonissen and Wells 1991) while Trx1-3 through Trx1-7 are new and
have been numbered in decreasing order of homology with the functional Trx1 gene. In
addition, to visualize the disablement differences between Trx1 gene and its processed
pseudogenes we have aligned their corresponding sequences with that of the Trx1 ORF
(Figure 2). All Trx1 pseudogenes harbor many different types of disablement mutations
(including Alu sequence insertions) resulting in a sequence which, if translated, would
render a truncated defective protein.
Apart from the lack of introns and disablement mutations, processed pseudogenes
have other features such as an imperfect polyA tail at the point where the homology
with the functional gene ceases, flanking repeats thought to be involved in the
retrotransposition event and the absence of promoter regulatory regions (Mighell et al.
2000). We have also looked for these additional characteristics in the thioredoxin
pseudogenes. For example, an imperfect polyA tail or at least a stretch of adenosine
residues was found in all pseudogenes, immediately after the point where the
homology with Trx1 mRNA ceases (not shown). Moreover, the promoter regions
described in the Trx1 gene (TATA box and SP1 site) (Kaghad et al. 1994) are absent in all
pseudogenes, as homology ends at different positions within the Trx1 5´-UTR (except for
the 5´-truncation in Trx1-7). Another characteristic of processed pseudogenes is that they
usually map to different chromosomes than the functional paralogue, in contrast to
non-processed pseudogenes, which arise from genomic duplication and therefore map
within the same chromosomal region. As shown in Table 1, Trx1 is located at
chromosome 9q31 while all Trx1 pseudogenes map to different chromosomes.
However, it has been difficult to identify the flanking repeats in some of the
pseudogenes, which might be due to accumulation of additional mutations in this area,
although the Trx1-1 and Trx1-2 repeats have been previously reported (Miranda-Vizuete
and Spyrou 2000; Tonissen and Wells 1991). Finally, it should be noted that only Trx1-7
presents a large 5´-truncation, which is assumed to arise from abortive arrest during the
reverse transcription process (Ophir and Graur 1997). This truncation would remove
the first 44 residues of the putative protein plus any 5´-UTR sequence. Trx1-1 and
Trx1-4 also have small 5´-truncations, of 47 and 35 bp respectively, compared with
the Trx1 cDNA (Figure 1).
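The search for an imperfect polyA tail mentioned above can be approximated by scanning the sequence downstream of the homology end point for an adenosine-rich window. This is a minimal sketch under assumed parameters: the window length of 20 and the 80% adenosine threshold are illustrative choices, not values taken from this study, and the function name is hypothetical.

```python
def has_poly_a_stretch(downstream, window=20, min_a_fraction=0.8):
    """Return True if any window of the downstream sequence is adenosine-rich.

    downstream: nucleotide string starting where homology with the mRNA ends.
    """
    seq = downstream.upper()
    if window <= 0 or len(seq) < window:
        return False

    # Count adenosines in the first window, then slide one base at a time.
    count = seq[:window].count("A")
    if count / window >= min_a_fraction:
        return True
    for i in range(window, len(seq)):
        count += (seq[i] == "A") - (seq[i - window] == "A")
        if count / window >= min_a_fraction:
            return True
    return False
```

Because the tail is expected immediately after the homology end point, a stricter variant would test only the first window rather than sliding along the whole sequence.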
Regarding human Grx1 we have found two processed pseudogenes, one of which
has already been reported (Miranda-Vizuete and Spyrou 2001). Before describing the
characterization of these Grx1 pseudogenes it is important to comment on the presence
of different Grx1 mRNA sequences. To date, two entries for the complete Grx1 cDNA
sequence have been deposited in the NCBI database: a longer form of 1328 nucleotides
reported by Park and Levine (NM_002064 originally derived from entry AF069668)
(Park and Levine 1996), and a shorter form of 837 nucleotides reported by Padilla et al.
(X76648) (Padilla et al. 1995). The two forms differ only in the length of their respective
3´-UTRs. A comparison of both sequences with the genomic clone AC008821, which
contains the complete human glutaredoxin gene, allowed us to determine the nature of
this discrepancy, which results from the use of two distinct splice acceptor sites for the
third exon; because the intron 2/exon 3 boundary lies within the 3´-UTR, this explains the
differences between the two forms (Figure 3). This finding is in agreement with the
genomic sequence for the human Grx1 gene previously reported (Park and Levine 1997),
although the absence of the intronic sequences in that paper would not have allowed
this conclusion. A search using both Grx1 cDNA sequences in the EST database (which
can be used as an estimation of the relative abundance of any given mRNA) found that
the two forms are expressed although a much higher proportion of the sequences are
derived from the shorter form (data not shown).
Despite the lower expression of the longer form, both human Grx1 pseudogenes, Grx1-1 and
Grx1-2, clearly derive from it. Like the Trx1 pseudogenes, Grx1-1 and
Grx1-2 are processed pseudogenes with multiple disablement features including
frameshifts, removal of the active site, in-frame stop codons, etc. (Table 2 and Figure 4).
They also have in common with the Trx1 pseudogenes the lack of any regulatory sequence
similar to the ones described for the functional gene (Park and Levine 1997), the
presence of an imperfect polyA tail and a chromosomal localization at different
positions from the functional paralogue (Table 2). Indeed, a Southern blot analysis
performed on human genomic DNA identified not only the functional Grx1 gene on
chromosome 5 but also other Grx-like sequences on chromosomes 12 and 14 (Padilla et
al. 1996). Grx1-2 maps to position 14q32.13-32.2, suggesting that it might be one of
these.
The presence of pseudogenes within the human genome should be viewed in a
wider context of a dynamic evolutionary process driven by retrotransposition. As
previously mentioned, approximately half of the human genome consists of repeated
sequences of various types with most of them arising from reverse transcription from
RNA (Baltimore 2001; Lander et al. 2001). Retrotransposition does not always generate
non-functional pseudogenes; it can also give rise to new functional forms that evolve
differently from the ancestral gene after they have integrated into the genome and become
transcriptionally active (Brosius 1999). Indeed, the thioredoxin family has such an
example in the intron-less Sptrx gene, which codes for a sperm-specific protein
(Miranda-Vizuete et al. 2001).
Pseudogenes are important for evolutionary studies because their pattern of
mutation is assumed to be neutral as they are not subject to functional constraints
(Harrison et al. 2001). This has allowed the calculation of the rate of nucleotide
substitutions, insertions and deletions in genomic DNA (Gojobori et al. 1982; Nachman
and Crowell 2000; Ophir and Graur 1997) or the determination of the rates of genomic
loss for an organism (Petrov et al. 2000). To gain more insight into the evolution of the
thioredoxin/glutaredoxin genes in the mammalian lineage we performed a
phylogenetic analysis to determine the origin of the identified pseudogenes. Figure 5A
shows the phylogenetic tree of the human Trx1 pseudogenes, all of which arose after the
hominid/rodent radiation. From this analysis two different Trx1 pseudogene groups
can be distinguished: group 1 includes Trx1-2, -3, -4, -5 and -6, which arose from
the functional hominid Trx1 gene, and group 2 is composed of Trx1-1 and -7,
which originated from the Trx1-2 and -6 pseudogenes, respectively. This is inferred from
the position of the pairs Trx1-1/-2 and Trx1-6/-7 on the same tree branch and the
shorter 5´-sequences of Trx1-1 and Trx1-7. We can also conclude that the Alu
insertion in Trx1-2 occurred after the duplication event.
The phylogenetic analysis for Grx1 pseudogenes is simpler as both Grx1-1 and
Grx1-2 arose from the longer form of Grx1 mRNA by a retrotransposition event
(Figure 5B). There is an apparent contradiction regarding the position of the pig Grx1
gene which would be expected to be located between human and rodent genes. The pig
Grx1 protein displays important changes in its sequence compared with the rest of the
mammalian glutaredoxins sequenced to date, including a residue change at the active
site, which might explain its anomalous tree location (Yang et al. 1989).
The age of a processed pseudogene (time since retrotransposition) can be
estimated from the divergence between its sequence and that of the functional gene,
measured as nucleotide substitutions, assuming a constant rate of 1 x 10^-9 substitutions
per site per year (Nachman and Crowell 2000). Figure 5 also shows the evolutionary
distance between the human pseudogenes and their corresponding functional
paralogues. For human Trx1 pseudogenes the divergence score matches well with the
phylogenetic analysis, but Trx1-5, -6 and -7 have a significantly higher score than the
rest of the thioredoxin pseudogenes and than that corresponding to the human/rodent
radiation, estimated to have occurred about 85 MY ago (Friedberg and Rhoads 2000). This
can be explained by the involvement of factors other than evolutionary time, such as the
locus of insertion, which could have had a significant influence on their evolution
(Mighell et al. 2000;
Petrov and Hartl 2000). Thus, pseudogene sequences may have detrimental position
effects on the expression of nearby genes or they may be mutagenic through
homologous DNA interactions with their functional counterparts (Wu and Morris
1999). In these cases, more drastic mutations in the pseudogene may be selectively
advantageous, as they are more likely to disrupt the deleterious activity of the
pseudogene (Petrov and Hartl 2000). The two human Grx1 pseudogenes have the same
divergence values, indicating that they might have originated close together in time from
the long form of human Grx1 mRNA. It has been proposed that only elements that
were generated less than approximately 150 MY ago are still discernible, in good agreement
with our data (Brosius 1999).
The fact that we have only identified processed pseudogenes in the human
thioredoxin/glutaredoxin family follows the general pattern of the majority of genes
with associated pseudogenes. Processed pseudogenes are generated from mRNA copies
that lack promoter sequences, so they are doomed to transcriptional silence after integration
unless they insert into a locus that supports their transcription. In contrast, non-processed
pseudogenes arise from gene duplication events which are likely to include not only
coding sequences but also regulatory elements (Brosius 1999). Therefore, there is a
much higher chance that these gene duplications might lead to novel active genes than
inactive pseudogenes thus explaining the lower number of non-processed pseudogenes
in any given family including the thioredoxin/glutaredoxin family.
A recent study has pinpointed the major features of the genes that give rise to
processed pseudogenes: widely expressed, highly conserved, short and GC-poor
(Goncalves et al. 2000), characteristics that match both the Trx1 and Grx1 genes. However,
this argument is not enough to explain why the rest of the thioredoxin/glutaredoxin
family members lack associated pseudogenes, as they also fulfil most of these criteria.
The most likely explanation is that, for unknown reasons, the Trx1 and Grx1 loci are
extremely active, generating copies of themselves and dispersing them all over the
genome. Only in a few cases has one of these copies become transcriptionally active,
evolving differently from the ancestral gene but losing its capacity to generate further
copies. Alternatively, the Trx1 and Grx1 genes may have been the ancestral modules that
generated novel forms of thioredoxin-like proteins as, throughout evolution, new
adapted forms of these proteins were needed to fulfil new functions, e.g. internal
fertilization (Sptrx-1 is a sperm-specific thioredoxin).
In conclusion, we have identified several putative pseudogenes derived from the
known members of the human thioredoxin/glutaredoxin family of proteins. In
addition, we have described an alternative splice acceptor site in Grx1 gene which
explains the existence of two forms of Grx1 mRNA differing in the length of their
respective 3´-UTRs. The human genome sequence has just been released and it will be
constantly updated during the upcoming years. Therefore, it is possible that revised
versions might slightly modify the data reported here although the identification of
novel pseudogenes is unlikely considering the homology criteria used in this study. The
availability of the information reported in this work will definitely help in the
identification of novel thioredoxins and glutaredoxins, facilitating rapid discrimination
between functional and non-functional paralogues, as well as providing the basis for an
evolutionary record when new mammalian genomes are sequenced.
ACKNOWLEDGEMENTS
We thank Drs. Peter Zaphiropoulos, Christine Sadek and Paula Cunnea for fruitful
discussions and critical reading of the manuscript and Dr. Eva Enmark for advice on the
Clustal W program. We also thank the two anonymous referees for helpful comments
and suggestions. This work was supported by grants from the Swedish Medical
Research Council (Projects 03P-14096-01A, 03X-14041-01A and 13X-10370), the Swedish
Cancer Society (961) and the Karolinska Institutet.
REFERENCES
Arner ES, Holmgren A (2000) Physiological functions of thioredoxin and thioredoxin
reductase. Eur J Biochem 267: 6102-6109
Åslund F, Ehn B, Miranda-Vizuete A, Pueyo C, Holmgren A (1994) Two additional
glutaredoxins exist in Escherichia coli: glutaredoxin 3 is a hydrogen donor for
ribonucleotide reductase in a thioredoxin/glutaredoxin 1 double mutant. Proc
Natl Acad Sci U S A 91: 9813-9817
Baltimore D (2001) Our genome unveiled. Nature 409: 814-816
Brosius J (1999) RNAs from all categories generate retrosequences that may be exapted
as novel genes or regulatory elements. Gene 238: 115-134
Cabrillac D, Cock JM, Dumas C, Gaude T (2001) The S-locus receptor kinase is inhibited
by thioredoxins and activated by pollen coat proteins. Nature 410: 220-223
Davis DA, Newcomb FM, Starke DW, Ott DE, Mieyal JJ, Yarchoan R (1997)
Thioltransferase (glutaredoxin) is detected within HIV-1 and can regulate the
activity of glutathionylated HIV-1 protease in vitro. J Biol Chem 272: 25935-25940
Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap.
Evolution 39: 783-791
Fields C, Adams MD, White O, Venter JC (1994) How many genes in the human
genome? Nat Genet 7: 345-346
Friedberg F, Rhoads AR (2000) Calculation and verification of the ages of
retroprocessed pseudogenes. Mol Phylogenet Evol 16: 127-130
Gojobori T, Li WH, Graur D (1982) Patterns of nucleotide substitution in pseudogenes
and functional genes. J Mol Evol 18: 360-369
Goncalves I, Duret L, Mouchiroud D (2000) Nature and structure of human genes that
generate retropseudogenes. Genome Res 10: 672-678
Grant CM (2001) MicroReview: Role of the glutathione/glutaredoxin and thioredoxin
systems in yeast growth and response to stress conditions. Mol Microbiol 39: 533-541
Graur D, Li W-H (2000) Fundamentals of Molecular Evolution, 2nd edn. Sinauer Inc.,
Massachusetts.
Harrison PM, Echols N, Gerstein MB (2001) Digging for dead genes: an analysis of the
characteristics of the pseudogene population in the Caenorhabditis elegans
genome. Nucleic Acids Res 29: 818-830
Hirota K, Matsui M, Murata M, Takashima Y, Cheng FS, Itoh T, Fukuda K, Junji Y
(2000) Nucleoredoxin, glutaredoxin, and thioredoxin differentially regulate
NF-kappaB, AP-1, and CREB activation in HEK293 cells. Biochem Biophys Res
Commun 274: 177-182
Holmgren A (1976) Hydrogen donor system for E. coli ribonucleotide diphosphate
reductase dependent upon glutathione. Proc Natl Acad Sci U S A 73: 2275-2279
Holmgren A (1989) Thioredoxin and glutaredoxin systems. J Biol Chem 264:
13963-13966
Holmgren A (2000) Antioxidant function of thioredoxin and glutaredoxin systems.
Antioxid Redox Signal 2: 811-820
Juttner J, Olde D, Langridge P, Baumann U (2000) Cloning and expression of a distinct
subclass of plant thioredoxins. Eur J Biochem 267: 7109-7117
Kaghad M, Dessarps F, Jacquemin-Sablon H, Caput D, Fradizeli D, Wollman EE (1994)
Genomic cloning of human thioredoxin-encoding gene: mapping of the
transcription start point and analysis of the promoter. Gene 140: 273-278
Kanzok SM, Fechner A, Bauer H, Ulschmid JK, Muller HM, Botella-Munoz J,
Schneuwly S, Schirmer RH, Becker K (2001) Substitution of the Thioredoxin
System for Glutathione Reductase in Drosophila melanogaster. Science 291: 643-646
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K,
Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann
L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda
C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C,
Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J,
Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman
R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S,
Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L,
Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims
S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER,
Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC,
Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS,
Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson
P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C,
Uberbacher E, Frazier M, et al. (2001) Initial sequencing and analysis of the human
genome. International Human Genome Sequencing Consortium. Nature 409: 860-921
Laurent TC, Moore EC, Reichard P (1964) Enzymatic synthesis of deoxyribonucleotides
IV. Isolation and characterization of thioredoxin, the hydrogen donor from E. coli
B. J. Biol. Chem. 239: 3436-3444
Lundberg M, Johansson C, Chandra J, Enoksson M, Jacobsson G, Ljung J, Johansson M,
Holmgren A (2001) Cloning and expression of a novel human glutaredoxin (Grx2)
with mitochondrial and nuclear isoforms. J. Biol. Chem. 276: 26269-26275
Mestres-Ortega D, Meyer Y (1999) The Arabidopsis thaliana genome encodes at least
four thioredoxins m and a new prokaryotic-like thioredoxin. Gene 240: 307-316
Mighell AJ, Smith NR, Robinson PA, Markham AF (2000) Vertebrate pseudogenes.
FEBS Lett 468: 109-114
Miranda-Vizuete A, Damdimopoulos AE, Gustafsson J-A, Spyrou G (1997a) Cloning,
expression and characterization of a novel Escherichia coli thioredoxin. J. Biol.
Chem. 272: 30841-30847
Miranda-Vizuete A, Damdimopoulos AE, Spyrou G (2000) The mitochondrial
thioredoxin system. Antioxid Redox Signal 2: 801-810
Miranda-Vizuete A, Gustafsson J-A, Spyrou G (1998) Molecular cloning and expression
of a cDNA encoding a human thioredoxin-like protein. Biochem. Biophys. Res.
Commun 243: 284-288
Miranda-Vizuete A, Gustafsson J-A, Spyrou G (1997b) NCBI database entry no.
MMU85089.
Miranda-Vizuete A, Ljung J, Damdimopoulos AE, Gustafsson J-A, Oko R, Pelto-Huikko
M, Spyrou G (2001) Characterization of Sptrx, a novel member of the thioredoxin
family specifically expressed in human spermatozoa. J. Biol. Chem. in press
Miranda-Vizuete A, Spyrou G (2000) Identification of a novel thioredoxin-1 pseudogene
on human chromosome 10. DNA Seq 10: 411-414
Miranda-Vizuete A, Spyrou G (2001) Identification of the first human glutaredoxin
pseudogene localized to human chromosome 20q11.2. DNA Seq. in press
Nachman MW, Crowell SL (2000) Estimate of the mutation rate per nucleotide in
humans. Genetics 156: 297-304
Nakamura H, Nakamura K, Yodoi J (1997) Redox regulation of cellular activation. Annu
Rev Immunol 15: 351-369
Ophir R, Graur D (1997) Patterns and rates of indel evolution in processed pseudogenes
from humans and murids. Gene 205: 191-202
Padilla CA, Bajalica S, Lagercrantz J, Holmgren A (1996) The gene for human
glutaredoxin (glrx) is localized to human chromosome 5q14. Genomics 32: 455-457
Padilla CA, Martínez-Galisteo E, Bárcena JA, Spyrou G, Holmgren A (1995) Purification
from placenta, amino acid sequence, structure comparisons and cDNA cloning of
human glutaredoxin. Eur. J. Biochem. 227: 27-34
Park JB, Levine M (1996) Purification, cloning and expression of dehydroascorbic acid-reducing activity from human neutrophils: identification as glutaredoxin. Biochem
J 315: 931-938
Park JB, Levine M (1997) The human glutaredoxin gene: determination of its genomic
organization, transcription start point and promoter analysis. Gene 197: 189-193
Pedrajas JR, Kosmidou E, Miranda-Vizuete A, Gustafsson J-A, Wright APH, Spyrou G
(1999) Identification and functional characterization of a novel mitochondrial
thioredoxin system in Saccharomyces cerevisiae. J. Biol. Chem. 274: 6366-6373
Petrov DA, Hartl DL (2000) Pseudogene evolution and natural selection for a compact
genome. J Hered 91: 221-227
Petrov DA, Sangster TA, Johnston JS, Hartl DL, Shaw KL (2000) Evidence for DNA loss
as a determinant of genome size. Science 287: 1060-1062
Saitoh M, Nishitoh H, Fujii M, Takeda K, Tobiume K, Sawada Y, Kawabata M,
Miyazono K, Ichijo H (1998) Mammalian thioredoxin is a direct inhibitor of
apoptosis signal-regulating kinase (ASK) 1. EMBO J. 17: 2596-2606
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing
phylogenetic trees. Mol Biol Evol 4: 406-425
Stewart EJ, Aslund F, Beckwith J (1998) Disulfide bond formation in the Escherichia coli
cytoplasm: an in vivo role reversal for the thioredoxins. EMBO J. 17: 5543-5550
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680
Tonissen KF, Wells JRE (1991) Isolation and characterization of human thioredoxin-encoding genes. Gene 102: 221-228
Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M,
Evans CA, Holt RA, Gocayne JD, Amanatides P, Ballew RM, Huson DH, Wortman
JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas
PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, Nadeau J,
McKusick VA, Zinder N, Levine AJ, Roberts RJ, Simon M, Slayman C, Hunkapiller
M, Bolanos R, Delcher A, Dew I, Fasulo D, Flanigan M, Florea L, Halpern A,
Hannenhalli S, Kravitz S, Levy S, Mobarry C, Reinert K, Remington K, Abu-Threideh J, Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M,
Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, Di Francesco V, Dunn P,
Eilbeck K, Evangelista C, Gabrielian AE, Gan W, Ge W, Gong F, Gu Z, Guan P,
Heiman TJ, Higgins ME, Ji RR, Ke Z, Ketchum KA, Lai Z, Lei Y, Li Z, Li J, Liang Y,
Lin X, Lu F, Merkulov GV, Milshina N, Moore HM, Naik AK, Narayan VA,
Neelam B, Nusskern D, Rusch DB, Salzberg S, Shao W, Shue B, Sun J, Wang ZY,
Wang A, Wang X, Wang J, Wei MH, Wides R, Xiao C, Yan C, et al. (2001) The
Sequence of the Human Genome. Science 291: 1304-1351
Wu CT, Morris JR (1999) Transvection and other homology effects. Curr Opin Genet
Dev 9: 237-246
Yang YF, Gan ZR, Wells WW (1989) Cloning and sequencing the cDNA encoding pig
liver thioltransferase. Gene 83: 339-346.
LEGENDS TO THE FIGURES
Figure 1. Schematic organization of human Trx1 mRNA and its associated
pseudogenes. The diagram is drawn to scale, using the human Trx1 mRNA GenBank entry
(NM_003329) as the reference. The Alu insertion is the only disablement displayed (see
Table 1 for a complete list of disablement features).
Figure 2. Sequence alignment of human Trx1 ORF with the corresponding
region of human Trx1 pseudogenes. Differences at the nucleotide level are shaded.
Major disablement features (see Table 1) are boxed. Double vertical lines indicate the
position of the Alu insertion. The sequence of the active site is shown in bold.
Figure 3. Genomic and mRNA organization of human Grx1. The sequences of
the splice acceptor sites are shown for shorter (above exon-3 box) and longer
(underneath exon-3 box) Grx1 mRNA forms. White boxes indicate the Grx1 ORF, while
shaded boxes denote the Grx1 UTRs.
Figure 4. Sequence alignment of human Grx1 ORF with the corresponding
region of human Grx1 pseudogenes. Differences at the nucleotide level are shaded.
Major disablement features (see Table 2) are boxed. The sequence of the active site is
shown in bold.
Figure 5. Phylogenetic tree of human Trx1 (A) and human Grx1 (B) genes with
their respective pseudogenes. The scale indicates the number of substitution events per
hundred bases. The numbers in parentheses show the estimated age of each pseudogene
in million years (MY) since retrotransposition from the ancestral functional
gene. Percentage bootstrap values (based on 1000 replications) are given on the nodes of
the trees.
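The relation between the scale of Figure 5 (substitutions per hundred bases) and an age in MY can be illustrated with a minimal sketch. It assumes a simple molecular clock in which the pseudogene accumulates substitutions at a constant neutral rate after retrotransposition. The function name and the rate passed to it are hypothetical illustrations and are not the values or the procedure used to produce the ages shown in the figure.

```python
def pseudogene_age_my(subs_per_100_bases: float,
                      rate_per_site_per_year: float) -> float:
    """Convert a divergence in substitutions per hundred bases into an
    age in million years (MY), assuming a constant substitution rate.

    Both arguments are illustrative inputs chosen by the caller; the
    rate in particular is an assumption, not a value from the paper.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    # Substitutions per hundred bases -> substitutions per site.
    subs_per_site = subs_per_100_bases / 100.0
    # Time in years = divergence / rate; then rescale to million years.
    years = subs_per_site / rate_per_site_per_year
    return years / 1e6
```

Under these assumptions the estimated age scales linearly with the divergence read from the tree, so halving the assumed rate doubles every age estimate.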
LEGENDS TO THE TABLES
Table 1. Classification of human Trx1 pseudogenes.
ᵃNumbers refer to nucleotide positions in the corresponding GenBank accession
entry. Double inclined bars indicate the position of the Alu insertion.
ᵇAmino acid residue numbers refer to those of the wild-type protein (NM_003329).
Table 2. Classification of human Grx1 pseudogenes.
ᵃNumbers refer to nucleotide positions in the corresponding GenBank accession
entry.
ᵇAmino acid residue numbers refer to those of the wild-type protein (NM_002064).