Group1 - OpenWetWare

Site-specific Incorporation of trans-4-hydroxyproline in Recombinant Collagens and
Gelatins
Doug Brownfield, Emily Perttu, Eddie Wang, and James Zhang
Introduction:
In order for synthetic biology systems to better mimic biology, methods for post-translational modification need to be developed. To reach their final form, many eukaryotic
proteins are post-translationally modified at specific amino acid residues. Studying these
proteins by expression in microbial hosts, which generally lack the capability to make these
modifications, can lead to products with diminished or no function. Furthermore, such studies provide
no insight into the purpose of the modifications. Therefore it is in synthetic biology’s interest to
have the tools necessary for creating these modifications. The goal of our project is to mimic the
action of one such modification in E. coli, the conversion of proline residues to trans-4-hydroxyproline residues. As an initial target protein, we will attempt to express collagens, as
they are the best studied and arguably most important class of proteins that contain
hydroxyproline. We hope to ensure high efficiency, fidelity, and yield of our product by
designing a genetic regulatory pathway.
Idea Overview:
Our objective is to site-specifically incorporate the non-canonical amino acid trans-4-hydroxyproline into collagen expressed in E. coli. We will accomplish this by engineering
an amber suppressor prolyl tRNA and a cognate hydroxyprolyl aminoacyl-tRNA synthetase. The
synthetase will specifically charge the prolyl tRNA with hydroxyproline and, in
response to amber stop codons, this charged tRNA will be read by the ribosome, thereby
incorporating hydroxyproline. Free trans-4-hydroxyproline will be continually supplied
by introducing an enzyme that catalyzes the conversion of proline to hydroxyproline.
Finally, gene translation will be regulated by introduction of an engineered hydroxyproline
responsive riboregulator. The resulting E. coli will have what amounts to a 21 amino acid
genetic code with all the cellular machinery necessary for efficient collagen expression.
Background:
Collagen is the most abundant protein in mammals, constituting
30% of a human’s protein mass. Serving as a scaffold, collagen is used by cells to mold their
surroundings, eventually cultivating an environment conducive to cellular functionalization and
tissue development. Besides mechanical support, collagen contains various ligands for growth
factor receptors and integrins that can influence such cellular actions as cell adhesion,
chemotaxis/migration, tissue remodeling, and wound healing.
By definition, collagen molecules consist of three polypeptide chains (α chains) and have
at least one domain composed of repeating Gly-X-Y sequences in each of the constituent chains
(Myllyharju and Kivirikko, 2004). Currently, vertebrates have at least 27 collagen types with 42
distinct α chains. Some collagens form homotrimers with the three α chains while others contain
two or even three different α chains. The X and Y positions can have any amino acid other than
glycine, but typically proline is found in the X position and 4-hydroxyproline in the Y
position. While 4-hydroxyprolines are essential for the stability of the triple helix, glycines are
necessary for packing the three chains into a coiled-coil structure. In this structure, each chain
forms a left-handed helix, and the three chains are wound around a common axis to form a triple helix with a
shallow right-handed superhelical pitch, making the final structure a rope-like rod.
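The Gly-X-Y rule described above can be expressed as a simple sequence check. The following is a minimal sketch in Python, not part of the proposed experimental work. It assumes one-letter amino acid codes and, as a labeling assumption of this sketch, uses the letter O for 4-hydroxyproline; all function names are hypothetical.

```python
def gly_x_y_triplets(seq):
    """Split a one-letter amino acid sequence into complete triplets."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_gly_x_y_repeat(seq):
    """True if every triplet is Gly-X-Y with X and Y being anything but Gly."""
    triplets = gly_x_y_triplets(seq)
    return bool(triplets) and all(
        t[0] == "G" and "G" not in t[1:] for t in triplets
    )


def y_position_fraction(seq, residue):
    """Fraction of triplets that carry `residue` in the Y (third) position."""
    triplets = gly_x_y_triplets(seq)
    if not triplets:
        raise ValueError("sequence contains no complete triplet")
    return sum(t[2] == residue for t in triplets) / len(triplets)


# Example: three triplets, two of which have hydroxyproline (O) at Y.
example = "GPOGPPGAO"
print(is_gly_x_y_repeat(example))
print(y_position_fraction(example, "O"))
```

A check like this only tests the repeat pattern; it says nothing about triple-helix stability, which depends on the hydroxyproline content discussed above.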
Collagen Types
To avoid confusion, collagens are numbered with roman numerals in the order of their
discovery (types I-XXVII). When referring to a collagen’s composition, each of the three α
chains is first given its chain number (1, 2, or 3), then the collagen type is given in
parentheses. For example, α2(I) denotes the second α chain of type I collagen, while α1(II) denotes the first α
chain of type II collagen.
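The chain notation can be read mechanically. The sketch below, in Python, parses a label such as α2(I) into a chain number and a collagen type number. The function names and the choice to return plain integers are assumptions made for illustration only.

```python
import re

# Roman numeral symbols needed for collagen types I through XXVII.
_ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral):
    """Convert a Roman numeral built from I, V and X to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = _ROMAN_VALUES[symbol]
        following = numeral[i + 1:i + 2]
        # A smaller symbol placed before a larger one is subtracted (IV, IX).
        if following and _ROMAN_VALUES[following] > value:
            total -= value
        else:
            total += value
    return total


def parse_alpha_chain(label):
    """Parse a label like 'α2(I)' into (chain_number, collagen_type)."""
    match = re.fullmatch(r"α([123])\s*\(([IVX]+)\)", label.strip())
    if match is None:
        raise ValueError(f"not a valid α chain label: {label!r}")
    return int(match.group(1)), roman_to_int(match.group(2))


print(parse_alpha_chain("α2(I)"))      # second α chain of type I collagen
print(parse_alpha_chain("α1(XXVII)"))  # first α chain of type XXVII collagen
```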
Division of collagen types into families is made mainly by the mechanism and structure
of matrix assembly. The nine collagen families with their relative types are: fibril-forming (I, II,
III, V, XI, XXIV and XXVII), fibril-associated collagens with interrupted triple helices
(FACITs) located on the surface of fibrils (IX, XII, XIV, XVI, XIX, XX, XXI, XXII and XXVI),
hexagonal forming (VIII and X), basement membrane forming (IV), beaded filaments (VI),
anchoring fibrils for basement membranes (VII), transmembrane collagens (XIII, XVII, XXIII
and XXV), and the family of type XV and XVIII collagens.
Certain collagens are expressed in a tissue specific manner, such as the types II, IX and
XI that are found almost exclusively in cartilage, while type XVII is only found in skin
hemidesmosomes. On the other hand, some collagen types are common in most extracellular
matrices, such as type I. Moreover, collagen fibrils often consist of more than one collagen type,
such as type I collagen fibrils that often contain small amounts of types III, V and XII. Further
heterogeneity within the superfamily results from alternative splicing of the transcripts of many
of the genes as well as the use of alternative promoters in some genes. The large number of
structurally distinct members of the superfamily implies that they are involved in numerous
biological functions (Kadler, 1995).
Collagen assembly
The majority of collagens share a similar assembly process, which is typically described
for type I. Starting inside the cell, three peptide chains are synthesized on ribosomes along the
Rough Endoplasmic Reticulum (RER). These peptide chains are referred to as preprocollagens
and each has registration peptides (on the ends) as well as a signal peptide. These peptide
chains are then sent into the lumen of the RER where they are cleaved into their procollagen
forms.
While still in the RER, these peptide chains proceed to undergo a series of functional
changes. First, the lysine and proline amino acids are hydroxylated, a process dependent on
ascorbic acid (Vitamin C). Next, specific hydroxylated amino acids are glycosylated, allowing
the three chains to associate into a triple helical structure. Finally, the procollagen is shipped to
the Golgi apparatus where it is packaged and secreted by exocytosis.
Once outside the cell, the collagen is again organized into a functional
matrix. Registration peptides are cleaved by procollagen peptidase, forming tropocollagen,
which can self-aggregate to form collagen fibrils, which in turn self-aggregate into collagen
fibers. For non-fibrillar collagen, the N- and C-propeptides remain and may play a critical role in
directing supramolecular assembly. After fiber formation, interchain crosslinking of collagen
occurs between hydroxylysine and lysine residues following oxidative deamination by lysyl oxidase
(Yamauchi and Shiiba 2002).
Prolyl 4-Hydroxylase (P4H)
As previously mentioned, hydroxylation of the Y-position proline residues is a critical
modification for generating stable triple helical collagen. This modification is carried out in the
lumen of the RER by the enzyme prolyl 4-hydroxylase (Tandon, 1998). The vertebrate forms of
these P4H’s are α2β2 tetramers in which the β subunit is identical to the protein disulfide
isomerase PDI (Myllyharju, 2003). Various isoforms of the catalytic α subunit have been found
in organisms of varying size and complexity, from humans to Drosophila (Vuori et al., 1992;
Annunen et al., 1999).
Another family of P4H’s in the cytoplasm has been uncovered and has been linked to the
regulation of the hypoxia-inducible transcription factor HIF (Ivan, 2001). Cytoplasmic P4H’s
have no PDI subunit, require different sequences flanking the prolines that are hydroxylated, and
have markedly higher Km values (Kivirikko and Myllyharju, 1998). No overall amino acid
sequence homology is detected between the collagen and the cytoplasmic HIF P4H’s, with the
exception of critical catalytic residues. HIF is continuously synthesized and under normoxic
conditions a critical proline residue in a -Leu-X-X-Leu-Ala-Pro- sequence is hydroxylated by the
cytoplasmic P4H’s, not by collagen P4Hs. The resulting 4-hydroxyproline residue is essential
for HIFα binding to the von Hippel–Lindau (VHL) E3 ubiquitin ligase complex for subsequent
proteasomal degradation. However, under hypoxic conditions hydroxylation ceases, allowing
HIFα to escape degradation and instead form a stable dimer with HIFβ (Jaakkola, 2001). Once
formed, the dimer is translocated into the nucleus and becomes bound to the HIF-responsive
elements in a number of hypoxia-inducible genes, such as those for erythropoietin, vascular
endothelial growth factor, glycolytic enzymes and even for the α(I) subunit of human type I
collagen (Takahashi, 2002).
Applications
Collagen has been widely used in cosmetic surgery, hemostats, device coatings,
resuscitation fluids, formulation excipients, capsules, cartilage reconstruction, drug delivery, as
well as skin substitutes for burn patients. However, both medical and cosmetic use is declining
because most commercially available collagens are derived from bovine or porcine
tissues. Mainly enriched in type I collagen, these preparations also contain small amounts of
type III as well as other collagens that are difficult and expensive to remove from the desired
material. Moreover, there is a high rate of allergic reactions from animal-derived collagens,
causing prolonged redness. Using collagen derived from cows also poses the risk of transmitting
prion diseases such as bovine spongiform encephalopathy (BSE). The scientific community also
uses collagen to study its role in tissue development and disease. Extracting sufficient
quantities of nontraditional or less prominent collagens is a costly and difficult task.
A processed form of collagen commonly used is gelatin. Derived from denatured
collagen, gelatin is composed of a mixture of collagen chains of different length, structure, and
composition. This distribution depends on what type(s) of collagens are extracted, the extraction
method, as well as the pH and ionic strength of the solution used for processing. Because gelatin
is a heterogeneous composition, especially in size and isoelectric point, the resulting products
will inevitably have variable gelling and physical properties (Olsen, 2005). This variability
presents a significant challenge for medical applications where stability, safety, and control are
necessary.
Cheaply produced recombinant collagens and gelatins have the potential to alleviate
many of the issues associated with animal derived versions. Given the large number of
aforementioned applications there is also a large market in this area. Scalable technology is
needed to make microbial expression of recombinant collagens a viable alternative to tissue
extraction. Using microbes to engineer collagen allows for greater control over collagen
synthesis and organization, which in turn increases the quality, consistency, and safety of
collagen production. It would also provide an easy platform for introducing altered primary
sequences into recombinant collagens. Such genetic control over collagen structure is crucial in
studying the impact of specific mutations on collagen structural hierarchical assembly and
associated functions and also would allow for the creation of designer collagen-mimetic
materials. Recombinant expression would also allow for the extraction of sufficient quantities of
native collagen forms that are present at low levels which are otherwise mainly characterized at
cDNA and genomic levels. This would allow for structural and functional analysis of these rarer
collagens.
Biomaterials applications for collagens in hemostats, as skin substitutes, in cartilage
reconstruction, and for drug delivery can benefit from the improved purity of cloned sources of
collagen. Purity in this case would mean reducing both other extracellular matrix components
that may be carried through the purification process, leading to potential inflammatory responses,
and bioburdens with potential impact on human health, particularly neurological disorders due to
prion concerns. Recombinant human collagen seems to avoid the immune reactions previously
described and is therefore more biocompatible. Recombinantly derived collagen was shown to
have superior mechanical strength and hemostatic activity compared to animal derived collagen
when formed into a matrix. Recombinant collagens can also be altered to include bioactive peptide sequences as well
as to be collagenase resistant.
Recombinant gelatins can be tailored to alter their gelling temperature by controlling their
hydroxyproline content. Moreover, they have been shown to be less allergenic. As they are
widely used in the food and drug industry, recombinantly derived gelatins can be made animal-free and thus open for consumption by vegetarians (Baez, 2005).
Past Work
Besides tissue extraction, nonmicrobial/bacterial systems have also been developed for
producing recombinant collagens. However, the current productivity, quality, and costs of these
nonmicrobial systems are not attractive for commercial applications (Baez, 2005). Mammalian
cells transfected with human collagen genes were the first system used for collagen production and are the
most efficient system for expressing properly prolyl hydroxylated full-length collagen that is also
secreted (Ala-Kokko et al. 1991). Expression of collagen genes in insect cells yielded
unstable non-hydroxylated collagen, but yielded hydroxylated versions when coexpressed with
the human P4H gene (Lamberg, 1996). However, insect cells fail to secrete it and instead accumulate
the product intracellularly. Other sources have included milk from transgenic animals, secretions
from transgenic silkworms, and transgenic plants. Most of these methods have not been able to
obtain the same degree of prolyl hydroxylation as in native human collagen. Moreover, cost and
productivity have not made them commercially viable.
Two recombinant systems using lower level organisms have been used in the production
of stable triple-helical human collagens: yeast and E. coli. For yeast, collagen fragments are
secreted as single-chain polypeptides via the yeast alpha-mating factor pre-pro sequence, but
secretion of full-length triple-helical procollagen, as seen in normal mammalian collagen
synthesis, has not been achieved. In contrast to mammalian expression, the trimerization of
collagen polypeptides has an inhibitory effect on secretion, leading to intracellular accumulation
despite the presence of the alpha-mating factor pre-pro secretory sequence. The most successful
work has come from coexpression of collagen with P4H in Pichia pastoris where scientists have
successfully generated 1-1.5 g/L of collagen type I, II, and III with hydroxylation levels being at
or near native levels. The products also showed the same thermal stability and morphology as
native collagen (Baez, 2005). In Saccharomyces cerevisiae, collagen type I was generated with
82% of native hydroxylation levels (Toman, 2000).
E. coli has seen limited success in collagen production. Small fragments (93-245 AA) of
bovine collagen α2(I) chain have been expressed (Hori et al. 2002). While accumulation levels
or purification yields aren't known, enough material was produced for identification via antibody
staining. Another group used E. coli to produce a totally synthetic gelatin made solely of 32
repeats of Gly-Pro-Pro (Goldberg et al. 1989). While properly constructed, the gelatin was
shown to accumulate in inclusion bodies. Furthermore, inhibition of the heat shock (HIF related)
response of E. coli significantly stabilized the expressed synthetic gelatin product. For the most
part, success in expressing collagen in E. coli has been limited by low yields mainly attributed to
the apparent instability of these highly repetitive genes in E. coli (Cappello, 1990). In an attempt to
circumvent this obstacle, an E. coli strain was engineered for the cotranslational incorporation of
hydroxyproline into various lengths of type I collagen (Buechter, 2003). This was achieved by
growing an E. coli culture engineered for increased prolyl-tRNA synthetase
accumulation in a hyperosmotic medium supplemented with hydroxyproline. However, the
resulting α1(I) collagen fragment was different from tissue-derived collagen in that
hydroxyproline was present at both X and Y positions of the Gly-X-Y triplets. Interestingly,
collagen fragments of this variant were still assembled into triple helices.
Collagen and P4H coexpression in E. coli has had setbacks because a pair of essential
disulfide bonds in the P4H β subunits are not formed in the cytoplasm of E. coli. However, it
has also been demonstrated that E. coli can produce properly folded human collagen P4H in the
periplasm (Neubauer, 2007). Also, novel mutant E. coli strains with a more oxidizing
cytoplasm have recently been developed and have successfully expressed proteins with up to 17
disulfide bonds. Using these strains, recent work has produced large amounts of
recombinant human collagen P4H in the E. coli cytoplasm, which resulted in higher amounts of the
active tetramer.
The major current limitation of these previous methods is that coexpression of P4H genes
with collagen genes still leaves the hydroxylation process at the whim of P4H's
sequence/structural specificity. That is, scientists have no power to specify the exact location of
hydroxyproline residues. This is a hindrance to the design of novel collagen based materials and
to any studies that might want to predictably alter the hydroxyproline content. It may also be the
case that not all collagen proteins created become hydroxylated to the same degree using these
methods, leading to a less homogeneous product.
Methods
Site-specific incorporation of hydroxyproline
In order to site-specifically incorporate hydroxyproline, we will use methods pioneered
by Peter Schultz’s lab at Scripps. These methods provide a means to genetically encode the
location of unnatural amino acids using the amber stop codon (TAG). The procedure will
involve three main steps: 1. Generating an orthogonal prolyl tRNA/prolyl tRNA aminoacyl
synthetase pair (tRNA[pro]/aaSyn). 2. Engineering the synthetase to acylate the tRNA with hydroxyproline
(Wang, 2006). 3. Optimization.
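To make the idea of genetically encoding hydroxyproline positions concrete, the following Python sketch rewrites a collagen-like coding sequence so that every Y-position proline codon becomes the amber codon TAG. It is an illustration under stated assumptions, not our cloning protocol: the input is assumed to be an in-frame DNA sequence covering only whole Gly-X-Y triplets, and the function name is hypothetical.

```python
AMBER = "TAG"
PROLINE_CODONS = {"CCT", "CCC", "CCA", "CCG"}
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def mark_y_prolines(coding_seq):
    """Replace each Y-position proline codon with the amber codon.

    Returns the edited sequence and the 0-based codon indices that were
    changed. Assumes the sequence is in frame and made only of Gly-X-Y
    triplets (9 nucleotides each); this is an assumption of the sketch.
    """
    seq = coding_seq.upper()
    if len(seq) == 0 or len(seq) % 9 != 0:
        raise ValueError("sequence must consist of whole Gly-X-Y triplets")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    amber_sites = []
    for start in range(0, len(codons), 3):
        if codons[start] not in GLYCINE_CODONS:
            raise ValueError(f"codon {start} is not glycine")
        y_index = start + 2
        if codons[y_index] in PROLINE_CODONS:
            codons[y_index] = AMBER
            amber_sites.append(y_index)
    return "".join(codons), amber_sites


# Gly-Pro-Pro followed by Gly-Ala-Pro: both Y prolines become TAG.
edited, sites = mark_y_prolines("GGTCCGCCAGGCGCTCCT")
print(edited, sites)
```

X-position prolines are deliberately left untouched, which is the distinction from supplementation approaches that place hydroxyproline at both X and Y positions.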
A successful starting point for generating tRNA/aaSyn pairs has been to look to archaeal
sources. Many archaeal tRNAs are not substrates for eubacterial aminoacyl synthetases, and
archaeal tRNAs and synthetases express efficiently in E. coli. Moreover, there is an ever-increasing amount of archaeal sequence and structural data available (Santoro, 2003).
Orthogonal pairs for leucine, glutamate, tyrosine, and lysine have already been created using
archaeal sources (Wang, 2006). We will attempt to use the tRNA[pro]/aaSyn pair from
Methanocaldococcus jannaschii because it is well studied and has available structural data
(PDB: 1NJ8). The pair is very likely to be orthogonal because prolyl aaSyn from M. jannaschii
has close homology to eukaryotic prolyl aaSyns and is actually in a different class of aaSyns than
E. coli’s. Also, several nucleotides important in E. coli tRNA[pro] recognition are different in
tRNA[pro] from M. jannaschii (Burke, 2001).
We will begin by changing the anticodon of M. jannaschii tRNA[pro] to one that pairs with the amber stop
codon and expressing the resulting tRNA (tRNA[proA]) in E. coli. Previous orthogonal pairs have relied on the
aaSyn to be unaffected by changes in the anticodon region. However, in the case of prolyl
aaSyn, the anticodon region on its cognate tRNA is known to be important for recognition
(Burke, 2001). Therefore, we will use directed evolution on what is known to be the enzyme’s
anticodon recognition region to alter its specificity to recognize the amber anticodon. A library
of aaSyns will be created and assayed for activity by coexpression with a fluorescent marker
bearing amber stop codons. The library members that best suppress the amber stop codon can be
isolated using FACS (Fig. 1). At this point we can proceed using the same methods as the
Schultz group. Briefly, this involves negative selection on a library of tRNA[proA], followed by
a round of positive selection to yield a tRNA[proA] that is completely orthogonal to the rest of
E. coli’s tRNAs (Wang, 2006).
The next step is altering the substrate specificity of the prolyl aaSyn. This involves
creating a library of aaSyn. At this point it is helpful to have the crystal structure of the aaSyn,
because this allows for specific directed evolution on the amino acid binding region. This library
is put through a round of positive selection followed by negative selection, ultimately yielding an
aaSyn that can charge hydroxyproline to its cognate tRNA (Fig. 2,3) (Wang, 2006).
The final challenge for site-specific incorporation is optimization. For the most part,
studies using artificial amino acids have involved inserting at one site in a specific product.
However, recombinant collagen will require several hydroxyprolines to be inserted per chain.
Inefficiency will result in poor yield and truncation products. Recently Ryu et al. have shown
that optimizing their system could lead to incorporation of an unnatural amino acid at levels
approaching that of natural amino acids (Ryu, 2006). We will no doubt have to perform similar
work to ensure incorporation of hydroxyproline with high efficiency and fidelity.
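The effect of per-site inefficiency on multi-site incorporation can be estimated with a simple model. The Python sketch below assumes, purely for illustration, that every amber codon is suppressed independently and with the same probability, so the full-length fraction is that probability raised to the number of sites. Real suppression efficiencies are context dependent, and the function names are hypothetical.

```python
def full_length_fraction(per_site_efficiency, n_sites):
    """Expected fraction of chains read through all amber codons.

    Assumes independent, identical suppression at each site (a
    simplifying assumption of this sketch).
    """
    if not 0.0 <= per_site_efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if n_sites < 0:
        raise ValueError("number of sites cannot be negative")
    return per_site_efficiency ** n_sites


def required_efficiency(target_fraction, n_sites):
    """Per-site efficiency needed to reach a target full-length fraction."""
    if not 0.0 < target_fraction <= 1.0 or n_sites <= 0:
        raise ValueError("need 0 < target <= 1 and at least one site")
    return target_fraction ** (1.0 / n_sites)


# With 90% suppression per site, ten sites leave about 35% full-length product.
print(round(full_length_fraction(0.9, 10), 3))
# Efficiency per site needed for half of the chains to be full length at 30 sites.
print(round(required_efficiency(0.5, 30), 4))
```

Under this model the remaining chains are truncation products, which is why per-site efficiency close to that of natural amino acids matters far more for collagen than for single-site insertions.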
Generation of hydroxyproline
An effective system for site-directed incorporation of hydroxyproline will require
sufficiently high concentrations of hydroxyproline, a situation uncommon in natural systems.
Traditionally, unnatural amino acids are supplemented to the medium; however, insufficient
uptake leads to truncation products (Liu, 2006). It has already been shown that hydroxyproline is
not efficiently transported into the cytosol (Buechter, 2003). Previously the Schultz group added
a pathway for the production of the non-canonical amino acid p-aminophenylalanine to E. coli (Mehl, 2003).
We are proposing a similar strategy for generating the hydroxyproline in vivo using a non-mammalian P4H, thereby alleviating the issue of poor uptake.
The majority of proline 4-hydroxylases act on peptidyl proline exclusively and in a
sequence specific manner. However, several studies have found P4H’s which also hydroxylate
free L-proline (Petersen 2003, Lawrence 1996, Bontoux 2006). In fact, free hydroxyproline is
used as a precursor for a variety of secondary metabolites such as in etamycin synthesis in
Streptomyces griseoviridus P8648 (Lawrence 1996). The P4H to be used in this study is derived
from a sequence cloned from Dactylosporangium sp. Previous work has shown that, when
cloned into E. coli, this P4H exhibits a 1600-fold increase in activity relative to its native host
and environment (Shibasaki 2000). Additionally, P4H requires several non-protein factors:
it is a 2-oxoacid- and ferrous iron-dependent dioxygenase (Lawrence 1996). Therefore, to
promote efficient hydroxylation, 2-oxoglutarate and Fe2+ will be provided in the culture media
for cellular uptake. Ascorbate will also be supplemented in the culture media. The P4H gene
will be introduced into a high copy number plasmid with a pMB1 origin of replication.
Shibasaki et al. showed that the hydroxyproline output could be tuned in several ways.
These include addition of L-proline, feedback resistant mutations for increased proline
biosynthesis, and mutations in proline degradation enzymes (Shibasaki 2000). Ultimately, the
relative concentrations of proline and hydroxyproline will need to be controlled and optimized
for efficient collagen synthesis.
Gene regulation
Typically, to express a protein, the cells are grown to a certain density and then induced
to start pumping out the product. However, our product relies heavily on the availability of
hydroxyproline. Therefore, to maximize efficiency of collagen production, we require our
pathway to be regulated such that the expression of collagen occurs only when there are
sufficient concentrations of hydroxyproline in the host. Otherwise translation of collagen will
halt at the amber stop codons due to a lack of hydroxyproline activated tRNA. This implies that
we need to engineer a genetic control element that is highly sensitive to free hydroxyproline. We
propose to construct a custom riboswitch that selectively binds to hydroxyproline, and induces
genes under its control when this binding occurs.
Riboswitches can be found in the 5’ untranslated region of the mRNA under regulation,
just before the ribosome binding site (RBS). Most riboswitches consist of two structural
domains: an aptamer and an expression platform (Tucker, 2005). The aptamer domain is highly
folded and binds specifically to a target molecule. Upon binding, the RNA undergoes structural
changes and the expression platform either exposes the RBS or hides it, thus facilitating
translation or inhibiting it. There are many classes of riboswitches characterized by differences
in the aptamer and expression platforms. One riboswitch discovered at the Breaker lab at Yale,
gcvT, is particularly well-suited for our purposes.
The gcvT motif is found in many bacterial species, including B. subtilis and V. cholerae,
and resides upstream of genes that participate in the glycine cleavage pathway. The gcvT riboswitch
is rare because it utilizes ligand binding to activate gene expression, whereas most other
riboswitches are used to repress gene activity. The ability to activate the gene in the presence of
ligand is exactly the feature we desire in our system. Also, gcvT selectively binds to a very
small molecule, glycine, which is composed of only 10 atoms. Our target molecule,
hydroxyproline, is also a small molecule; therefore gcvT can serve as a very good starting
template for our engineered riboswitch. Furthermore, unlike other riboswitches, gcvT has two
aptamer domains, type I and type II, with a highly conserved linker sequence in
between. Experiments have shown that these two domains cooperatively bind to glycine with a
Hill coefficient of between 1.4 and 1.6 (Mandel, 2004). This gives us a new mechanism to tune
the sensitivity of this riboswitch. With only one aptamer domain, gcvT went from 10% to 90%
ligand bound over a 100-fold increase in glycine concentration. With two aptamer
domains, however, the same change in ligand binding occurs with only a 10-fold increase in
glycine concentration.
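The link between cooperativity and switching sharpness can be worked through with the Hill equation. The Python sketch below assumes the idealized form, fraction bound = L^n / (K^n + L^n), from which the concentration ratio needed to go from 10% to 90% bound is 81^(1/n). The function names are hypothetical, and the model is a simplification rather than a fit to the published gcvT data.

```python
def fraction_bound(ligand, k_half, hill_n):
    """Hill equation: fraction of riboswitch bound at a ligand concentration."""
    if ligand < 0 or k_half <= 0 or hill_n <= 0:
        raise ValueError("need ligand >= 0, k_half > 0 and hill_n > 0")
    return ligand ** hill_n / (k_half ** hill_n + ligand ** hill_n)


def fold_change_10_to_90(hill_n):
    """Fold increase in ligand needed to move from 10% to 90% bound.

    Solving the Hill equation gives L = K * (f / (1 - f)) ** (1 / n), so the
    ratio L(0.9) / L(0.1) equals (9 / (1 / 9)) ** (1 / n) = 81 ** (1 / n).
    """
    if hill_n <= 0:
        raise ValueError("hill_n must be positive")
    return 81.0 ** (1.0 / hill_n)


for n in (1.0, 1.4, 1.6, 2.0):
    print(n, round(fold_change_10_to_90(n), 1))
```

In this idealized model a non-cooperative site (n = 1) needs an 81-fold change and perfect two-site cooperativity (n = 2) needs a 9-fold change, the same orders of magnitude as the 100-fold and 10-fold figures quoted above. Hill coefficients of 1.4 to 1.6 fall in between, at roughly 16-fold to 23-fold.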
We propose to use the gcvT riboswitch as the starting template for our hydroxyproline gene
switch. The template will be computationally redesigned to bind to hydroxyproline instead of
glycine. RNA structural prediction and folding is well studied and previous work has
demonstrated the feasibility of computational design of ribozymes with desired function
(Penchovsky, 2005). To further enhance the specificity and binding affinities of the
hydroxyproline aptamers, we will use directed evolution via SELEX (systematic evolution of
ligands by exponential enrichment) (Ellington 1990) on the most promising candidates as
determined from the computational step. We will test the function of these hydroxyproline
riboswitches in vivo by inserting them at the 5’ end of GFP mRNA and inducing GFP translation
by adding hydroxyproline to the medium.
The hydroxyproline responsive riboswitch will be ultimately incorporated upstream of
the RBS of our engineered collagen genes. The flexibility in our riboswitches allows us to fine
tune the activity of different genes by using slightly different riboswitches with different
hydroxyproline sensitivities. Our goal is to design the control system such that we can create
and maintain a steady state production of collagen. This will likely involve extensive testing to
determine the concentration of hydroxyproline and methods to alter the degree of induction.
Collagen Secretion/Purification
Ideally we will be able to secrete our product in order to ease purification. It may be
possible to achieve this by fusion with a signal sequence such as pelB that has been shown to
facilitate the secretion of the attached protein (Sletta 2007, Xuyang 1995). Xuyang, et al.
demonstrated the ability of the pelB sequence to direct peptides to the extracellular medium
through the secA pathway, achieving a total protein production of 2.2 g/L, half of which was in
the soluble form. A second comparison study by Sletta et al. replaced the pelB sequence with
another naturally occurring secretion signal, ompA. They observed that for certain proteins,
when coupling to pelB did not result in significant transport, coupling to ompA did. Therefore, in our study,
both secretory signals will be tested as it is difficult to predict a priori which, if any, will be most
effective in driving the secretion of collagen.
Protein secretion in E. coli also involves the help of chaperone proteins. Because the
level of secretion required in the proposed system is much greater than under normal conditions,
additional chaperones will be necessary. To account for this, sequences encoding additional
SecB, DnaK, and DnaJ will also be included (Baneyx 1999) on a ColE1-based plasmid with a
copy number that can be regulated depending on the initial secretion results.
Collection and purification of the collagen will begin with the isolation of the soluble
collagen from the culture media. E. coli does not secrete natural proteins in high volumes and
therefore the soluble collagen should have few proteinaceous contaminants. The soluble collagen
will be collected through gentle centrifugation. Next, the cells enclosing the remaining collagen
will be lysed and the cell fragments will be removed by centrifugation. The ionic content of the
collagen-containing solution will then be increased, causing the collagen to precipitate. The
precipitate will then be taken up into an acidic solution in which the collagen is
soluble. These two steps will remove proteins that do not precipitate at high ionic strengths first
and then those that do not re-solubilize in acidic conditions. Finally, both the soluble collagen
collected early on, and the isolated cell-based collagen will be purified through ion exchange
chromatography.
Analysis and System Characterization
Several aspects of the system require analysis in order to maximize overall collagen
synthesis. First, the P4H concentration and enzymatic activity will have to be measured. The
P4H levels in the cells can be determined by fusion of GFP with the P4H coding sequence. The
amount of P4H can be detected by fluorescence. The activity of the enzyme will be determined
by the conversion of L-proline to trans-4-hydroxy-L-proline as measured by HPLC. The
measurements will be made for a range of cofactor and cosubstrate concentrations (Fe2+, 2-oxoglutarate, ascorbate).
Next, because this system’s claimed benefit is the production of collagen with low
polydispersities and highly uniform placement of hydroxyproline, the structure of collagen must
be carefully analyzed. SDS-PAGE will be used to ensure single product formation. GPC will be
used to determine the polydispersity and molecular weight of the collagen. Individual strands
can be sequenced through Edman degradation techniques to confirm the incorporation of the
hydroxyprolines. Overall yields, including the ratio of secreted to non-secreted collagen will also
be determined based on the purified masses. Finally, TEM images can be used to determine the
morphology of the collagen fibrils.
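Polydispersity from GPC data is conventionally reported as the ratio of the weight-average to the number-average molecular weight. The Python sketch below computes both averages and their ratio from a list of molar masses and the number of chains at each mass. The input format and function name are assumptions for illustration; an actual GPC trace would first need calibration and baseline processing.

```python
def molecular_weight_averages(masses, counts):
    """Return (Mn, Mw, PDI) for chains of given molar masses and counts.

    Mn = sum(N_i * M_i) / sum(N_i)
    Mw = sum(N_i * M_i**2) / sum(N_i * M_i)
    PDI = Mw / Mn, which equals 1 for a perfectly uniform product.
    """
    if len(masses) != len(counts) or not masses:
        raise ValueError("masses and counts must be non-empty and equal length")
    if any(m <= 0 for m in masses) or any(n < 0 for n in counts):
        raise ValueError("masses must be positive and counts non-negative")
    total_chains = sum(counts)
    total_mass = sum(n * m for m, n in zip(masses, counts))
    if total_chains == 0:
        raise ValueError("at least one chain is required")
    mn = total_mass / total_chains
    mw = sum(n * m * m for m, n in zip(masses, counts)) / total_mass
    return mn, mw, mw / mn


# A single species gives PDI = 1; adding truncation products raises it.
print(molecular_weight_averages([100.0], [5]))
print(molecular_weight_averages([100.0, 200.0], [1, 1]))
```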
Issues and Troubleshooting
There are several aspects of the proposed system which may yield insufficient results and
may require alterations. Generating orthogonal tRNA[pro]/aaSyn pairs can lead to dead ends
(Wang, 2006). In that case, we can attempt to use non-archaeal sources such as yeast, or attempt
to use previously generated orthogonal pairs by engineering their aaSyn substrate specificities for
hydroxyproline. In terms of purification it is unlikely that a large majority of the collagen will be
secreted especially if it is of high molecular weight. Although several studies have reported
successful secretion mechanisms, there is not a ubiquitous pathway and most of the reported
successes were specific to protein type. We hope that one of the previously studied signaling
sequences will facilitate the secretion of a large portion of the collagen; however, if this is not
observed it is possible to relocate our system into a different organism such as yeast. Secretion
of recombinant proteins from yeast has been more successful that from E. coli. In fact, Julio
Baez et al have reported secretion of collagen from yeast in amounts ranging from 3-14 g/L.
Therefore, yeast cells offer a realistic alternative for overcoming low secretion in E. coli.
Ultimately, the most daunting task will be fine-tuning the flux through each pathway to
ensure full-length collagen production. For example, designing the concentration dependence of the
hydroxyproline-responsive riboregulator will depend heavily on determining the turnover
efficiency of P4H. It will also depend on determining the concentration of hydroxyproline
necessary for efficient amber suppression. This interdependence means that the project will
require a great deal of iterative alteration and optimization.
Timeframe
We believe that our initial goal of site-specific introduction of hydroxyproline into
collagen will take two years, and that at least two more years will be needed to further
optimize the system and possibly make it commercially viable. Figure 4 displays a proposed
timeline. Given a large enough team, generation of the orthogonal pair, design of the
hydroxyproline-responsive riboregulator, characterization of P4H activity, and secretion testing
could be done in parallel in about 1.5 years. We believe another half year would be necessary to
combine the systems for low-level, low-efficiency production of collagen. The following two
years would be spent optimizing the system, mainly for increased yield. Finally, an additional
year would be necessary for scaling up the production process.
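The phase durations above can be added up to check that they match the two-year initial goal and the overall length of the timeline. The following is a minimal sketch in Python, assuming the phases run strictly back to back. The phase labels and the function name are illustrative and do not come from the proposal's figure.

```python
# Phase durations in months, taken from the paragraph above.
phases = [
    ("Parallel development of the four subsystems", 18),
    ("Integration for low-level collagen production", 6),
    ("Optimization, mainly for yield", 24),
    ("Production scale-up", 12),
]


def cumulative_schedule(phases):
    """Return (name, start_month, end_month) tuples for back-to-back phases."""
    schedule = []
    start = 0
    for name, duration in phases:
        end = start + duration
        schedule.append((name, start, end))
        start = end
    return schedule


schedule = cumulative_schedule(phases)
for name, start, end in schedule:
    print(f"{name}: month {start} to month {end}")
```

Under these assumptions the first two phases end at month 24, which corresponds to the two-year initial goal, and the full plan ends at month 60.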
Ethical/Social Impacts
We do not foresee any significant ethical issues with our project besides the usual
concerns about genetic engineering and synthetic biology in general. The number one concern
that we must address is that of safety. The host organism we propose to use, E. coli, is well
studied, widely used, and poses minimal risk. Using our engineered collagen in humans for
medical purposes will nevertheless require the necessary FDA approvals. Even so, our
bacteria-produced collagen is probably safer than the current animal-derived alternatives, because
extracting and using bovine collagen poses a risk of transmitting prion diseases such as bovine
spongiform encephalopathy (BSE). Using bacteria to produce quantities of human collagen has
ethical advantages as well. Our product could replace much of the animal-derived collagen now in
use, the production of which is opposed by animal rights organizations such as PETA.
The ability to cheaply produce medical-grade collagen will have a significant impact on
society. Demand for collagen-based biomaterials is predicted to continue to rise,
mainly due to the aging of the baby boomer generation. Frost & Sullivan reported that in 2001
the total US market for collagen-based biomaterials generated over $70 million in revenue. This
is projected to grow beyond $91.8 million by 2008 (Murrieta, 2002). Our technology for
producing cheap, high-quality collagen could capture a significant share of this growing market.
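For scale, the two market figures quoted above imply a compound annual growth rate of a little under 4% per year. The following is a minimal sketch of that arithmetic in Python, treating both figures as exact values. That is an assumption, since the text gives them only as lower bounds, and the function name is illustrative.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes in millions of US dollars for 2001 and 2008, treated as exact.
rate = compound_annual_growth_rate(70.0, 91.8, 2008 - 2001)
print(f"Implied growth rate: {rate:.1%} per year")
```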
Conclusion
In summary, we believe that successful completion of our proposal will lead to facile,
high-yield production of monodisperse collagens and gelatins containing site-specifically
incorporated hydroxyproline. This will allow scientists to study the structure and assembly
of collagen and the importance of hydroxyproline at a previously unachievable level.
Additionally, this may provide a new source of animal-free collagens and gelatins.
References
Ala-Kokko L, Hyland J, Smith C, Kivirikko KI, Jimenez SA, Prockop DJ. Expression of a
human cartilage procollagen gene (COL2A1) in mouse 3T3 cells. J Biol Chem. (1991) 266:
14175–14178.
Annunen, P, Koivunen P, Kivirikko KI. Cloning of the α subunit of prolyl 4-hydroxylase from
Drosophila and expression and characterization of the corresponding enzyme tetramer with some
unique properties. J. Biol. Chem. (1999) 274: 6790–6796.
Baez, et al. Recombinant microbial systems for the production of human collagen and gelatin.
Applied microbiology and biotechnology 69, 245 (2005).
Baneyx, Francois. Current Opinion in Biotechnology (1999) Vol.10, Issue 5 411-421.
Bontoux, M.-C. et al. Tetrahedron Letters 47 (2006) 9073-9076.
Buechter, D. D. et al. Co-translational Incorporation of Trans-4-Hydroxyproline into
Recombinant Proteins in Bacteria. J. Biol. Chem. 278, 645-650 (2003).
Burke, et al. Divergent adaptation of tRNA recognition by Methanococcus jannaschii prolyl-tRNA
synthetase. The Journal of biological chemistry 276, 20286 (2001).
Cappello J, Crissman J, Dorman M, Mikolajczak M, Textor G, Marquet M, Ferrari F. Genetic
engineering of structural protein polymers. Biotechnol Prog. (1990) 6: 198–202.
Ellington, A. D. & Szostak, J. W. Nature (1990) 346, 818-822
Goldberg I, Salerno AJ, Patterson T, Williams JI. Cloning and expression of a
collagen-analog-encoding synthetic gene in Escherichia coli. Gene. (1989) 80: 305–314.
Hori H, Hattori S, Inouye S, Kimura A, Irie S, Miyazawa H, Sakaguchi M. Analysis of the major
epitope of the alpha2 chain of bovine type I collagen in children with bovine gelatin allergy. J
Allergy Clin Immunol. (2002) 110: 652–657.
Ivan M, Kondo K, Yang H. HIFα targeted for VHL-mediated destruction by proline
hydroxylation: implications for O2 sensing. Science. (2001) 292: 464–468.
Jaakkola P, Mole DR, Tian YM. Targeting of HIFα to the von Hippel–Lindau ubiquitylation
complex by O2-regulated prolyl hydroxylation. Science. (2001) 292: 468–472.
Kadler K. Extracellular matrix 1: fibril-forming collagens. Protein Profile (1995) 2: 491–619.
Kivirikko KI, Myllyharju J. Prolyl 4-hydroxylases and their protein disulfide isomerase subunit.
Matrix Biol. (1998) 16: 357–368.
Lamberg, A. et al. Characterization of Human Type III Collagen Expressed in a Baculovirus
System. J. Biol. Chem. 271, 11988-11995 (1996).
Lawrence, Christopher et al. Biochem J. (1996) 313, 185-193.
Liu. Recombinant expression of selectively sulfated proteins in Escherichia coli. Nature
biotechnology 24, 1436 (2006).
Mandel, M. et al. Science 306, 275-279.
Mehl, R. A. et al. Generation of a Bacterium with a 21 Amino Acid Genetic Code. J. Am. Chem.
Soc. 125, 935-939 (2003).
Murrieta, T. Baby boomers drive demand for collagen-based biomaterials. Health & Medicine
Week, July 1, 2002 pp 16.
Myllyharju J, Kivirikko K. Collagens, modifying enzymes and their mutations in humans, flies
and worms. TRENDS in Genetics. (2004) 20: 33-43.
Myllyharju, J. Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis. Matrix Biol.
(2003) 22: 15–24.
Neubauer A, Soini J, Bollok M, Zenker M, Sandqvist J, Myllyharju J, Neubauer P. Fermentation
process for tetrameric human collagen prolyl 4-hydroxylase in Escherichia coli: Improvement by
gene optimisation of the PDI/β subunit and repeated addition of the inducer anhydrotetracycline.
Journal of Biotechnology. (2007) 128: 308–321.
Penchovsky, R. & Breaker, R.R. Computational design and experimental validation of
oligonucleotide-sensing allosteric ribozymes. Nature Biotechnology (2005) Vol. 23, No. 11 pp.
1424-1433.
Petersen, L. et al. Appl. Microbiol. Biotechnol. (2003) 62:263-267.
Ryu. Efficient incorporation of unnatural amino acids into proteins in Escherichia coli. Nature
methods 3, 263 (2006).
Santoro. An archaebacteria-derived glutamyl-tRNA synthetase and tRNA pair for unnatural
amino acid mutagenesis of proteins in Escherichia coli. Nucleic Acids Research 31, 6700 (2003).
Sletta, H. et al. Applied and Environmental Microbiology Feb. 2007 Vol. 73, No. 3 pp. 906-912.
Takahashi Y, Takahashi S, Shiga Y, Yoshimi T, Miura T. Hypoxic induction of prolyl
4-hydroxylase α(I) in cultured cells. J. Biol. Chem. (2002) 275: 14139–14146.
Tandon M, Wu M, Begley TP, Myllyharju J, Pirskanen A, Kivirikko K. Substrate specificity of
human prolyl-4-hydroxylase. Bioorg Med Chem Lett. (1998) 8: 1139–1144.
Toman, P. D. et al. Production of Recombinant Human Type I Procollagen Trimers Using a
Four-gene Expression System in the Yeast Saccharomyces cerevisiae. J. Biol. Chem. 275,
23303-23309 (2000).
Tucker, B.J. & Breaker, R.R. Riboswitches as versatile gene control elements. Curr Opinions in
Structural Biology (2005) 15:342-348
Vuori K., Pihlajaniemi T. Marttila M., Kivirikko KI. Characterization of the human prolyl
4-hydroxylase tetramer and its multifunctional protein disulfide-isomerase subunit synthesized in a
baculovirus expression system. Proc. Natl. Acad. Sci. (1992) 89: 7467–7470.
Wang, L., Xie, J. & Schultz, P. G. EXPANDING THE GENETIC CODE. Annu. Rev. Biophys.
Biomol. Struct. 35, 225-249 (2006).
Xuyang Li et al. Applied and Environmental Microbiology July 1995 Vol. 61, No. 7 pp. 2670-2680.
Yamauchi M, Shiiba M. Lysine hydroxylation and crosslinking of collagen. Methods Mol Biol
(2002) 194: 290.
Figure 1: A library of proline aaSyns will be screened for their ability to suppress the amber
codons in a GFP gene in the presence of tRNA[proA]. Winners will be determined by FACS.
Figure 2: A library of tRNA[proA] is subjected to negative selection followed by positive
selection to enrich for tRNA[proA] that are completely orthogonal in E. coli (Wang, 2006)
Figure 3: Several rounds of positive and negative selection on a library of aaSyns are performed
in order to generate an enzyme specific for the unnatural amino acid. This figure illustrates the
process for a tyrosine aaSyn (Wang, 2006).
Figure 4: Predicted timescale for achieving goals throughout the project. The chart's time axis
runs from 6 to 60 months in 6-month steps, and it shows the following tasks:
Hydroxyproline tRNA construction (10 persons)
Testing and optimization of HP insertion efficiency with HP tRNA
Identification of aptamer regulation system (10 persons)
Optimization of aptamer regulation with respect to HP concentration
Testing of aptamer regulation with GFP
Implementation of secretory system (10 persons)
Testing of secretory system with GFP
Testing and optimization of secretory system with collagen
Construction of enzyme plasmid (10 persons)
Testing and optimization of the unregulated P4H efficiency, with respect to cofactor and
cosubstrate, free proline, etc.
Testing and optimization of regulated enzyme plasmid
Scale up and commercialization