TEXT S1: Cytoplasmic Ribosomal Proteins

Martin Helmkampf and Jürgen Gadau
School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States of America
Ribosomes are vital components of the translational machinery that directs
protein synthesis in all cells. In eukaryotes, ribosomes residing in the cytoplasm are
composed of a large (60S) and small (40S) ribosomal subunit, which together comprise
about 80 proteins (CRPs – cytoplasmic ribosomal proteins) and four RNA species
(rRNA) [1]. Although the central processes of protein translation are catalyzed by rRNA,
CRPs fulfill many important roles relevant to ribosome biogenesis, stability and
molecular interaction. These include facilitating rRNA folding, protecting the rRNA
from nuclease degradation, tethering mRNA during translation, and serving as a binding
platform for translation factors. CRPs also link the ribosome to cellular signaling
pathways, thus permitting the regulation of translation levels and possibly localized
ribosome recruitment [2]. Many CRPs perform extra-ribosomal functions as well, for
instance in DNA repair, transcription regulation and apoptosis [3]. It has been
hypothesized on the basis of this multifunctionality that ribosomal proteins were
co-opted from a pre-existing set of proteins during the transformation of the ribosome
from an RNA-only complex to a ribonucleoprotein particle [4]. Homology between a
substantial number of eubacterial, archaeal and eukaryotic ribosomal protein genes further
suggests that this conversion occurred before the divergence of these ancient lineages.
Due to their universally essential role, the basic functional and structural features of
these genes have since been preserved. In eukaryotes, the number and sequence of
CRPs are thus highly conserved, although they can be encoded by a variable number of
Gene models coding for Atta cephalotes CRPs were identified by performing
BLAST searches against the official gene set v1 (OGS1.0) produced by MAKER. CRP
sequences of Drosophila melanogaster, taken from FlyBase (http://flybase.org), served
as query sequences. The obtained gene models were inspected and edited if necessary
using the annotation editor software Apollo [5]. Care was taken to ensure that the
predicted gene structures matched corresponding transcriptomic data. Gene models
were also aligned to homologous protein sequences from D. melanogaster, Apis
mellifera (obtained from the Ribosomal Protein Gene Database,
http://ribosome.med.miyazaki-u.ac.jp), Pogonomyrmex barbatus and Linepithema
humile (both unpublished) using MAFFT v6 ([6], default parameters) to monitor the
integrity of the reading frame and the extent of the predicted coding domains. Gene
homology relations were inferred by querying the annotated D. melanogaster proteins
deposited at FlyBase with the translated gene models. Best reciprocal BLAST hits were
interpreted as orthologs [7]. Pseudogenized gene copies were identified by searching
the A. cephalotes genome assembly using the tblastn program and the D. melanogaster
CRP sequences as queries, with the low-complexity filter disabled and the e-value
cutoff set to 10^-4. The number of CRP genes in Nasonia vitripennis was determined by the
same strategy. Identity scores between protein sequence pairs were computed by
bl2seq, part of the BLAST software package. Nomenclature of the CRP genes follows
Wool et al. [4] and Marygold et al. [8].
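
The best-reciprocal-hit criterion used above for ortholog assignment can be illustrated with a minimal sketch. It assumes that BLAST hits from both search directions have already been parsed into (query, subject, bit score) tuples; the function names and gene identifiers are hypothetical, and this is not the authors' actual pipeline.

```python
def best_hits(hits):
    """Map each query ID to its highest-scoring subject ID.

    `hits` is an iterable of (query_id, subject_id, bit_score) tuples,
    e.g. parsed from tabular BLAST output (an assumption of this sketch).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
forward = [
    ("Acep_A", "Dmel_RpL22", 210.0),
    ("Acep_A", "Dmel_RpL22-like", 150.0),
    ("Acep_B", "Dmel_RpL22", 190.0),
]
reverse = [
    ("Dmel_RpL22", "Acep_A", 208.0),
    ("Dmel_RpL22", "Acep_B", 188.0),
    ("Dmel_RpL22-like", "Acep_A", 140.0),
]
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In this toy input only one of the two similar gene models is paired with the reference gene. A strict one-to-one criterion of this kind therefore does not by itself assign every copy of a recently duplicated gene, so additional copies would have to be recognized by other means.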
In total, we identified 89 genes in the A. cephalotes genome that encode the full
complement of 79 CRPs traditionally recognized in animal genomes. While the majority
of CRPs are represented by single genes, eight are encoded by gene duplicates
(RpL11, RpL14, RpS2, RpS3, RpS7, RpS13, RpS19, RpS28), and one by a gene
triplicate (RpL22). With the exception of RpL14a/b, all multi-copy genes display identical
intron-exon structures and high sequence similarity (95 % on average) between
paralogues, suggesting a recent evolutionary origin. This interpretation is supported by
the fact that the homologous genes are of single-copy status in A. mellifera and N.
vitripennis. In addition, a recent newcomer to the list of ribosomal protein genes, the
receptor for activated C kinase (RACK1) [9], has also been identified. Of the two
CRP-like genes present in all eukaryotes [8], only RpL24-like could be found, while
RpLP0-like seems to have been lost. The corresponding proteins are presumably not
associated with ribosome function, and might not be as essential as proper CRPs
(indeed, loss of CRP-like genes has been reported before, e.g. in Rattus norvegicus). In
contrast to other genomes, neither additional CRP-like genes (characterized by low
sequence similarity to the reference gene), nor processed pseudogenes were
discovered [10]. As in other eukaryotes, RpL40, RpS27A and RpS30 precursors are
C-terminally fused to ubiquitin or a ubiquitin-like protein. All genes mentioned above are
supported by EST data, testifying to the high expression levels expected from CRP
genes. It is conceivable, however, that only one gene of multiple functional copies is
transcribed at a high level, as is generally assumed to be the case in animals [8,10].
Overall, the CRP gene inventory of A. cephalotes is highly similar to that of other
insects, both with regard to gene number (88, 80 and 79 in D. melanogaster, A.
mellifera and N. vitripennis, respectively) and sequence similarity (78 % identity to D.
melanogaster at the protein level, range 52–100 %).
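
The gene count reported above can be checked with a short tally. This sketch uses only the figures given in the text and assumes that the total of 89 genes covers the 79 canonical CRPs alone, that is, without RACK1 and RpL24-like; the variable names are illustrative.

```python
# CRPs encoded by two gene copies and by three gene copies, as listed in the text.
duplicated = ["RpL11", "RpL14", "RpS2", "RpS3", "RpS7", "RpS13", "RpS19", "RpS28"]
triplicated = ["RpL22"]

total_crps = 79  # full CRP complement reported for A. cephalotes

# Every CRP that is neither duplicated nor triplicated is single-copy.
single_copy = total_crps - len(duplicated) - len(triplicated)

# One gene per single-copy CRP, two per duplicate, three per triplicate.
total_genes = single_copy + 2 * len(duplicated) + 3 * len(triplicated)

print(single_copy, total_genes)
```

With 8 duplicated and 1 triplicated CRPs, 70 CRPs remain single-copy, and 70 + 16 + 3 = 89 genes, in agreement with the reported total.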
1. Taylor DJ, Devkota B, Huang AD, Topf M, Narayanan E, et al. (2009) Comprehensive
Molecular Structure of the Eukaryotic Ribosome. Structure (London, England:
1993) 17: 1591.
2. Brodersen DE, Nissen P (2005) The social life of ribosomal proteins. FEBS Journal
272: 2098.
3. Warner JR, McIntosh KB (2009) How Common Are Extraribosomal Functions of
Ribosomal Proteins? Molecular cell 34: 3.
4. Wool I, Chan Y, Gluck A (1995) Structure and evolution of mammalian ribosomal
proteins. Biochemistry and Cell Biology 73: 933–947.
5. Lewis SE, Searle SMJ, Harris N, Gibson M, Iyer V, et al. (2002) Apollo: a sequence
annotation editor. Genome Biology 3: research0082.1–0082.14.
6. Katoh K, Misawa K, Kuma Ki, Miyata T (2002) MAFFT: a novel method for rapid
multiple sequence alignment based on fast Fourier transform. Nucleic Acids
Research 30: 3059.
7. Wall DP, Fraser HB, Hirsh AE (2003) Detecting putative orthologs. Bioinformatics 19:
8. Marygold S, Roote J, Reuter G, Lambertsson A, Ashburner M, et al. (2007) The
ribosomal protein genes and Minute loci of Drosophila melanogaster. Genome
Biology 8: R216.
9. Sengupta J, Nilsson J, Gursky R, Spahn CMT, Nissen P, et al. (2004) Identification of
the versatile scaffold protein RACK1 on the eukaryotic ribosome by cryo-EM. Nat
Struct Mol Biol 11: 957.
10. Zhang Z, Harrison P, Gerstein M (2002) Identification and Analysis of Over 2000
Ribosomal Protein Pseudogenes in the Human Genome. Genome Research 12: