10. Expression vectors and production systems.
Because production of a cloned protein yields large quantities of the product, the yield
is readily analysed and visualized by SDS-PAGE.
Induced versus uninduced expression may be compared, as well as the compartment and
state of production : intracellular vs periplasmic vs extracellular (secretion),
and soluble vs insoluble. (The latter aspects are not shown in the Figure.)
1. Basic aspects
- regulation
- high expression levels may
- cause toxicity (of the protein product) to the cell
- cause "metabolic drain" : disturb the metabolism
=> any change that reduces the expression level (e.g. mutations, loss of
plasmid, ...) will give the cell a growth advantage
- to avoid toxicity : grow the cell population to an appropriate density
before synthesis of the “cloned” protein begins
=> i.e. induce expression (and until then, keep it repressed)
- induce expression in late-logarithmic phase (with cells still
metabolically active)
- regulation is primarily at the transcriptional level
- other factors: translation, stability (vector, transcript, product)
- transcription:
- E.coli RNA polymerase : α2, β, β'
general promoter structure (‘consensus’ sequences)
σ70 promoter : TTGACA (16-19 nt) TATAAT (5-8 nt) start (+1)
zones : –35 box and –10 box ; optimal distances : 17 nt between the two boxes, 7 nt from the –10 box to the start
UP-elements can have an extra influence on efficiency
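The consensus layout above can be turned into a simple search. The following is a minimal sketch, not a validated promoter predictor: the function names, the per-box mismatch limit and the ranking rule (fewest mismatches, then spacer closest to 17 nt) are assumptions made for illustration only.

```python
MINUS35 = "TTGACA"   # consensus -35 hexamer
MINUS10 = "TATAAT"   # consensus -10 hexamer


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sigma70_promoters(seq, max_mismatch=1):
    """List candidate sigma70 promoters in a DNA string (5'->3', coding strand).

    A candidate is a -35-like hexamer followed, after a 16-19 nt spacer,
    by a -10-like hexamer. max_mismatch is an assumed tolerance per box.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        box35 = seq[i:i + 6]
        m35 = mismatches(box35, MINUS35)
        if m35 > max_mismatch:
            continue
        for spacer in range(16, 20):           # 16-19 nt between the boxes
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) < 6:                 # ran past the end of the sequence
                break
            m10 = mismatches(box10, MINUS10)
            if m10 <= max_mismatch:
                hits.append({
                    "pos35": i, "box35": box35,
                    "spacer": spacer,
                    "pos10": j, "box10": box10,
                    "mismatches": m35 + m10,
                })
    # Assumed ranking: fewest mismatches first, then spacer closest to 17 nt.
    hits.sort(key=lambda h: (h["mismatches"], abs(h["spacer"] - 17)))
    return hits
```

Real promoter strength also depends on UP-elements and on the regulatory context, which such a pattern search does not capture.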
- initiation : efficient promoters and their regulation
- lacUV5 : IPTG, LacI, LacIq, titration, 'leakiness'
- trp : repression by tryptophan + aporepressor (TrpR) ; induction by IAA (indoleacrylic acid)
- Ptac : Ptrp upstream of –20 (–35 region) + PlacUV5 downstream of –20 (–10 region and lac operator)
(cf. PtacI, PtacII, Ptrc, and other variants)
induction by IPTG
- PL, PR : repressor CIts857 : induction by temperature shift
or : CIwt + induction by mitomycin C
- T7 : delivery of T7 RNA polymerase required
- encoded on another (compatible) plasmid
- encoded by a λ prophage (cIts) in a lysogenic E.coli strain
- by infection with a recombinant M13 phage carrying T7 gene 1
- araBAD : arabinose + AraC : positive and negative regulation
=> PL and PBAD are blocked very efficiently, Plac (and its derivatives)
are ‘leaky’; with phage T7 promoters, all interference with the
E.coli polymerase is avoided.
- termination
- factor-dependent (Rho (ρ), Tau (τ), NusA (cf. antitermination in phage λ))
- factor-independent (GC-rich stem-loop structure + oligoT-stretch)
terminators often used in expression vectors are :
- the fd terminator (in Ff phages between gIII and gVIII, see figure in ch.4)
- the rrnB operon terminators T1 and T2
- the phage T7 terminator Tf
- translation
- initiation: RBS (ribosome binding site), AUG (or GUG, 91 vs 8% in E.coli)
- sequence motifs : Shine-Dalgarno sequence
RBS ≠ Shine-Dalgarno (SD) motif : the SD motif is only one part of the RBS
RBS = about 55 nt between positions –35 and + 22 (with respect to AUG)
SD : complementary to the 3' end of 16S rRNA : 5' ….GAUCACCUCCUUA-OH 3' (E.coli)
basic pattern : 5'….AGGA… +/- 7nt … AUG(start)… 3'
efficiency is quite unpredictable
variations : GAGG, GGAG, GGAGG
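The basic pattern (an SD-like motif roughly 7 nt upstream of the start codon) can be checked with a short scan. This is a sketch under stated assumptions: the gap window of 5-9 nt is an illustrative reading of "+/- 7 nt", the function name is hypothetical, and a hit says nothing about actual efficiency, which remains unpredictable as noted above.

```python
# SD-like motifs listed in the notes (basic pattern and variations).
SD_MOTIFS = ["GGAGG", "AGGA", "GAGG", "GGAG"]
START_CODONS = ("AUG", "GUG")


def find_sd_candidates(mrna, min_gap=5, max_gap=9):
    """Return (start_index, start_codon, motif, gap) tuples.

    gap = number of nucleotides between the last nt of the motif and the
    first nt of the start codon. The 5-9 nt window is an assumption.
    """
    mrna = mrna.upper().replace("T", "U")      # accept DNA or RNA notation
    results = []
    for start in range(len(mrna) - 2):
        codon = mrna[start:start + 3]
        if codon not in START_CODONS:
            continue
        for motif in SD_MOTIFS:
            for gap in range(min_gap, max_gap + 1):
                end = start - gap              # index just past the motif
                begin = end - len(motif)
                if begin >= 0 and mrna[begin:end] == motif:
                    results.append((start, codon, motif, gap))
    return results
```

Overlapping motifs (e.g. GGAG inside GGAGG) are reported separately, so one strong site may yield several tuples.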
- secondary structure
the AUG (GUG) initiation triplet should be in a readily accessible
region, e.g. at the top of a stem-loop structure
triplets following AUG influence efficiency (as well as the preceding nt)
(and obviously also the secondary structure)
- codon usage
- the effect of codon usage is very complex
- same amino acid => multiple codons
- same codon => multiple tRNA's
- different codons => same tRNA
- codon-anticodon binding strength
- comparison of codon usage frequencies of highly expressed genes versus
poorly expressed genes, hints at the effect of some particular triplets
on the expression yield (see Table)
- with synthetic genes : using the 'optimal' codewords usually gives
a good expression level (though not necessarily the highest one)
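Comparing codon usage between highly and poorly expressed genes starts from simple counts. The sketch below tallies codons in a coding sequence and gives the share of each codon within one synonymous family. The function names are hypothetical; the leucine codon list follows the standard genetic code.

```python
from collections import Counter

# The six leucine codons of the standard genetic code.
LEU_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (DNA or RNA notation)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def relative_usage(counts, synonymous_codons):
    """Fraction of each codon within one synonymous family."""
    total = sum(counts[c] for c in synonymous_codons)
    return {c: (counts[c] / total if total else 0.0) for c in synonymous_codons}
```

Running `relative_usage` on the pooled counts of a set of highly expressed genes and on those of a set of poorly expressed genes gives the two frequency profiles to compare.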
- gene dosage effect
- increasing copy number may have
- no effect
- a positive effect
- a negative effect (e.g. the expression of trypsin)
=> can only be determined empirically ; no general rule
- stability
- transcription termination signal beyond the target gene is required for
plasmid stability
- par locus (or loci) for plasmid stability (important at the time of induction)
- transcript stability : mRNA degradation is a complex process
- both 5' and 3' UTR play a role (e.g. a stem-loop structure in the 5' UTR
gives stabilisation)
- there is no inverse correlation between the size and half-life of an mRNA
hence : degradation does not depend on non-specific endonucleolytic cleavage
- protein stability : degradation is strongly regulated in E.coli: there is a
large number of proteases in the cytoplasm, periplasm and at the inner
and outer membranes.
N-end rule of Varshavsky (the N-terminal amino acid determines the half-life):
- R, K, L, F, Y, W : half-life of 2 min on a test protein
- other amino acids (except P) : half-life of 10 hours
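The two classes above amount to a lookup on the N-terminal residue. A minimal sketch follows. The function name is hypothetical and the values are the ones quoted in these notes for a test protein, so they should not be read as predictions for an arbitrary product.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
DESTABILISING = set("RKLFYW")      # half-life of about 2 min on a test protein


def n_end_half_life_min(n_terminal_residue):
    """Half-life in minutes according to the rule as quoted in the notes.

    Returns None for proline, which the rule as stated leaves out.
    """
    aa = n_terminal_residue.upper()
    if aa not in AMINO_ACIDS:
        raise ValueError("not a standard one-letter amino acid code: %r" % aa)
    if aa == "P":
        return None
    if aa in DESTABILISING:
        return 2
    return 10 * 60                 # about 10 hours
```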
Initiating Met:
usually the formyl-Met (first amino acid) is cleaved off.
This seems to work most readily if the 2nd amino acid has a short side-chain
(which facilitates cleavage by the methionine aminopeptidase).
Stress conditions cause induction of certain proteases,
a.o. protease La (lon gene) ;
these are under the control of σ32 promoters :
- rpoH : gene for the RNA polymerase σ32 subunit
- mutants in rpoH can give a dramatic increase of expression
yield (and product stability)
There are many more specific observations that may be taken into account
2. Major expression systems (induction and regulatory circuits) : some examples
- the pET system
- with T7 RNA polymerase,
- extra regulation (repression) by LacI (IPTG induction)
- delivery of T7 RNA polymerase from an additional expression
unit on a λ(DE3) prophage with PlacUV5 as promoter
(also under LacI control : lacO as operator sequence)
- additional control by T7 lysozyme, encoded on a compatible
pACYC-derived plasmid (pLysE or pLysS)
=> the lysozyme binds to and inhibits (residual) T7 RNA polymerase
- the binary trp-cI system
- double system using compatible plasmids
- target gene downstream of the PL promoter
- synthesis of CI (coded by the cI repressor gene) is regulated by Ptrp
- addition of tryptophan activates expression of the target gene by shutting
down CI synthesis
(as seen above, regulation by CIts857 is also possible, but it is less
strict (wt-CI binds more efficiently), and adding an inducer to large
fermentors is easier than raising the temperature ; moreover, the
temperature shift would induce stress mechanisms (heat shock).)
- pBAD system
- dimer AraC binds to I1 and O2 operator sites
(the DNA loop blocks transcription)
- dimer AraC + arabinose : binds to the I1 and I2 sites
(the loop opens and transcription is activated)
- this is sensitive to catabolite repression : CAP + cAMP
(cAMP is low if the glucose concentration is high)
=> interplay of arabinose and glucose concentrations might be used
to control the level of expression
(nb. CAP-regulation is also present in the Lac operon, but tac promoters are not CAP-dependent)
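The pBAD regulatory logic described above can be summarised as a small truth table. The sketch below is purely qualitative: the function name and the labels "off", "reduced" and "high" are illustrative assumptions, not measured expression levels, and real behaviour depends on the actual sugar concentrations.

```python
def pbad_state(arabinose, glucose):
    """Qualitative state of a pBAD promoter for two yes/no inputs.

    arabinose: True if arabinose is present
    glucose:   True if the glucose concentration is high (so cAMP is low)
    """
    # Without arabinose the AraC dimer bridges I1 and O2 (DNA loop, blocked);
    # with arabinose it occupies I1 and I2.
    arac_sites = ("I1", "I2") if arabinose else ("I1", "O2")
    # CAP needs cAMP, and cAMP is low when glucose is high.
    cap_camp_active = not glucose
    if not arabinose:
        expression = "off"
    elif cap_camp_active:
        expression = "high"
    else:
        expression = "reduced"     # assumed label for catabolite repression
    return {
        "arac_sites": arac_sites,
        "cap_camp_active": cap_camp_active,
        "expression": expression,
    }
```

Listing all four input combinations shows why the arabinose and glucose concentrations together may be used to tune the expression level.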
3. Production strategies
E. coli is the major ‘first line’ organism for expression
=> disadvantages (limitations)
- no (few) post-translational modifications
- virtually no efficient secretion routes towards the medium
(secretion brings the protein into the periplasmic space)
- extensive S-S bridges are difficult to form
Why recombinant expression ?
=> increase production yield
=> facilitate purification
=> create novel variants (mutants, insertions, fusions, etc.)
The expression product may be :
- soluble or insoluble
- mature protein or fusion product
- accumulate in the cytoplasm or be secreted into the periplasm (type II secretion)
(with E. coli, extracellular secretion is very limited, e.g. type I secretion based on haemolysin transport)
Expression product :
- as mature protein
- may be deposited in inclusion bodies
- allows easy purification
- protects against break-down
- requires solubilisation and refolding
=> denature (in e.g. guanidinium hydrochloride) and (try to) renature
- slowing the synthesis rate may reduce inclusion body formation (in part)
(e.g. by lowering the temperature)
- the expression construct requires critical manipulation of the ribosome
binding region to ensure efficient (high-level) translation initiation
- the MET-problem : is the initiator (formyl-)methionine removed?
- as fusion protein (e.g. random insertions at ScaI site of cat : AGTACT)
- N-terminal and C-terminal fusions are possible (or multiple partners)
- advantage : translation initiation efficiency may be largely retained
(in C-terminal fusions)
- fusion partner can be a target for purification (tag)
- fusion partner may have activity to assay production level
- fusion to a signal peptide sequence can promote secretion
=> secretion allows formation of (at least some) S-S bridges
- fusion partner may allow anchoring in the membrane
- an epitope recognized by a specific antibody may be added
(for detection or quantification or purification)
- choice of intracellular or periplasmic expression (secretion)
Trying to avoid (or to reduce) the formation of inclusion bodies
(improve chances of protein folding)
- grow cells at reduced temperature (retards the expression process)
(or use other conditions that retard growth, e.g. composition of the media & pH values)
- co-expression of chaperones : DnaK, GroES, GroEL
- removing critical amino acid positions (e.g. in interferon)
- fusion to thioredoxin, or some other proteins
Secretion : is a specific kind of fusion
- the fusion partner (signal peptide) is removed during secretion
- no methionine remains at the N-terminus
- position of cleavage is not always guaranteed
=> different secretion signals may have to be tried : from OmpA,
OmpT, PelB, β-lactamase, alkaline phosphatase, etc. Some of
these may allow leakage to the growth medium when overexpressed.
- !!! but not all proteins are ‘secretable’
In general :
1) small peptides : often (peptide) stability problem :
=> fusion approach preferred : e.g. LacZ fusions
2) intermediate size:
few S-S : intracellular
more S-S : periplasmic (secretion)
3) large proteins : more problematic
4. Purification and processing
Fusions to allow easy purification : e.g. glutathione S-transferase (GST), MalE (MBP : maltose binding
protein), oligo-His (hexa-His), etc. : partners (carriers) or tags. Purification
by affinity chromatography or by immobilisation followed by elution.
Release of the tag/partner by cleavage with factor Xa, enterokinase, etc.,
or the use of intein processing.
Some examples :
1. C-terminal fusion in pBAD/His vectors
- six Histidine triplets
- insertion in correct reading frame : BglII site
- binding to a column with immobilized Ni2+
- cleavage site for enterokinase (D-D-D-D-K* ) (* = cleavage site)
(n.b. some residual amino acids are left at the N-terminus after processing)
2. C-terminal fusion to biotin carboxylase
- the biotin is covalently attached to a lysine by biotin ligase (BirA)
(E. coli has one biotinylated protein but it does not bind to streptavidin in its native configuration)
- immobilisation to streptavidin (batchwise or by affinity chromatography)
- factor Xa cleavage site : I-E-G-R* or I-D-G-R* (followed by not-P and not-R)
(* = cleavage site)
(secondary sites usually G-R*)
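The recognition sequences quoted in examples 1 and 2 can be located in a fusion protein sequence with regular expressions. A minimal sketch, assuming one-letter amino acid codes: the function name is hypothetical, and the secondary G-R* sites of factor Xa are deliberately not modelled.

```python
import re

# Factor Xa : I-E-G-R* or I-D-G-R*, followed by a residue that is not P or R.
FACTOR_XA = re.compile(r"I[ED]GR(?=[^PR])")
# Enterokinase : D-D-D-D-K*
ENTEROKINASE = re.compile(r"DDDDK")


def cleavage_positions(protein, pattern):
    """Indices (0-based) of the first residue released after each cut.

    The cut lies directly C-terminal of the recognition motif, so
    protein[pos:] is the fragment downstream of the protease site.
    """
    return [m.end() for m in pattern.finditer(protein.upper())]
```

For a tag-site-target fusion, `protein[pos:]` at the reported position is the processed target, which makes it easy to see whether residual amino acids remain at its N-terminus.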
3. processing by intein
- N-terminal fusion to intein + chitin binding domain
=> self-cleavage at the N-terminus of the intein by thiol compounds
- expression from the T7 promoter, fused to lacO
=> induction with IPTG
- bind fusion product to the chitin column
- wash the column, then equilibrate the column with a DTT solution
=> in situ cleavage at 4°C overnight
- elute the target protein
- strip the fusion partner (intein-CBD) from the column with SDS
4. MalE-fusions : vectors pMAL-c2 and pMAL-p2
(without (c) or with (p) the signal peptide sequence of the malE gene :
cytoplasmic or periplasmic expression, respectively)
- expression unit between Ptac promoter and rrnB terminators
lacIq on the vector to provide sufficient quantity of repressor
- fusion between malE carrier and lacZα, separated by a 10×N spacer
(ten asparagine residues) and the MCS for insertion of the target gene.
This in-frame construct produces blue colonies on BCIG (X-gal) media.
The polar 10×N peptide separates the two protein moieties.
Cloning in the MCS produces white colonies.
- affinity purification on an amylose column :
following elution with maltose and factor Xa cleavage, the fusion
partner is removed by a second chromatography on amylose.
- MCS contains an XmnI site at the left side, which overlaps the coding
sequence of the factor Xa cleavage site and allows an exact fusion to the
target gene sequence. Other insertion sites leave some extra amino acids
at the N-terminus after factor Xa cleavage.
- The EcoRI cloning site lies in the same reading frame as the EcoRI
site in lacZ : exchanging a cloned gene from λgt11 is easy.
5. Optimisation strategies
- C-terminal fusion to a reporter gene to assay expression yield
- monitoring expression level variation upon (minor) modifications