Recombinant Protein Production
- Introduction to Expression Systems -
Core Facility of Recombinant Protein Production, National Research Program for Genomic Medicine

Recombinant Protein Production
-Why?
• over-expression to obtain sufficient amounts of protein
• easy purification
-Application
• functional studies
• structural studies
• vaccines/antigens/antibodies
• therapeutic drugs
• industrial enzymes for reactions

Application: Drug Discovery

Application: therapeutic proteins
• Actimmune (IFN-γ)
• Activase (TPA)
• BeneFix (F IX)
• Betaseron (IFN-β)
• Humulin
• Novolin
• Pegademase (adenosine deaminase)
• Epogen
• Regranex (PDGF)
• Novoseven (F VIIa)
• Intron-A
• Neupogen
• Pulmozyme
• Infergen
• There are now more than 200 approved peptide and protein pharmaceuticals on the FDA list (http://www.accessexcellence.org/RC/AB/IWT/The_Biopharmaceuticals.html)

Application: structural genomics

Principle in Protein Production
Bioinformatics → Target identification and cloning → Protein expression test → Protein purification and production → Applications

Protein Expression Bottleneck
DNA → Cloning → Expression → Purification → Enzymology / Crystallography
• Protein Biochemistry
  – soluble, purifiable protein
• Enzymology
  – soluble, active protein
  – 0.1-10 mg of protein
• Crystallography
  – soluble, crystallizable protein
  – 5-100 mg of protein

Bottlenecks to efficient protein expression in E. coli
• Inefficient transcription (result: no or little protein synthesized)
  – Promoter choice and design
• Inefficient translation (result: no or little protein synthesized)
  – Codon usage
  – Transcript stability
  – Transcript secondary structure
• Inefficient folding, cytoplasmic or periplasmic (result: aggregation or degradation)
  – Improper secondary, tertiary or quaternary structure formation
  – Inefficient or improper disulfide bridge formation
  – Inefficient isomerization of peptidyl-prolyl bonds
• Inefficient membrane insertion/translocation
• Toxicity
  Results for these last two: cell death; aggregation or degradation

Protein Expression and Purification
• Isolation of gene of interest
• Introduction of gene to expression vector
• Transformation into host cells
• Growth of cells through fermentation
• Isolation & purification of protein

Cloning and expression of target gene
Gene of Interest + Expression Vector → Recombinant Vector → Expression of Fusion Protein

Cloning Process
• Gene of interest is cut out with restriction enzymes (REs)
• Host plasmid (a circular DNA molecule) is cut with the same REs
• Gene is inserted into the plasmid and joined with ligase
• New (engineered) plasmid is introduced into a bacterium (transformation)

Cloning (Details)
[Figure: cloning details, from gene to protein]

Recombinant Protein Expression Systems
• Escherichia coli
• Other bacteria
• Pichia pastoris
• Other yeast
• Baculovirus
• Animal cell culture
• Plants
• Sheep/cows/humans
• Cell free
[Figure label: Polyhedra]

Expression System Selection
• Choice depends on size and character of protein
  – Large proteins (>100 kDa)? Choose a eukaryote
  – Small proteins (<30 kDa)? Choose a prokaryote
  – Glycosylation essential? Choose baculovirus or mammalian cell culture
  – High yields, low cost? Choose E. coli
  – Post-translational modifications essential? Choose yeast, baculovirus or another eukaryote

Which Vector?
• Must be compatible with host cell system (prokaryotic vectors for prokaryotic cells, eukaryotic vectors for eukaryotic cells)
• Needs a good combination of
  – strong promoters
  – ribosome binding sites
  – termination sequences
  – affinity tag or solubilization sequences
  – multi-enzyme restriction site

Plasmids and Vectors
• Circular pieces of DNA ranging in size from 1000 to 10,000 bases
• Able to independently replicate and typically code for 1-10 genes
• Often derived from bacterial “mini” chromosomes (used in bacterial sex)
• May exist as single copies or dozens of copies (often used to transfer antibiotic resistance)

Key Parts to a Vector
• Origin of replication (ORI)
  – DNA sequence for DNA polymerase to replicate the plasmid
• Selectable marker (Amp or Tet)
  – a gene that, when expressed on the plasmid, will allow host cells to survive
• Inducible promoter
  – Short DNA sequence which enhances expression of adjacent gene
• Multi-cloning site (MCS)
  – Short DNA sequence that contains many restriction enzyme sites

A Generic Vector
[Figure: a generic vector]

Which Vector?
• Promoters
  – arabinose systems (pBAD), phage T7 (pET), Trc/Tac promoters, phage lambda PL or PR
• Tags
  – His6 for metal affinity chromatography (Ni)
  – FLAG epitope tag DYKDDDDK
  – CBP, calmodulin binding peptide (26 residues)
  – E-coil/K-coil tags (poly E35 or poly K35)
  – c-myc epitope tag EQKLISEEDL
  – Glutathione-S-transferase (GST) tags
  – Cellulose binding domain (CBD) tags

Gene Introduction (Bacteria)

Bacterial Transformation
• Moves the plasmid into the bacterial host
• Essential to making the gene “actively” express the protein inside the cell
• 2 routes of transformation
  – CaCl2 + cold shock
  – Electroporation
• Typical transformation rate is 1 in 10,000 cells (not very efficient) for CaCl2, but 1 in 100 for electroporation

Electroporator
Settings shown: 25 µF, 2500 V, 200 Ω, for a pulse of about 5 ms (time constant RC = 200 Ω × 25 µF = 5 ms)

Electroporation
• Seems to cause disruption in the cell membrane
• Reconstitution of the membrane leads to large pores which allow DNA molecules to enter
• Works for bacteria, yeast and animal cells

Bacterial Systems
Advantages
• Grow quickly (8 hrs to produce protein)
• High yields (50-500 mg/L)
• Low cost of media (simple media constituents)
• Low fermentor costs
Disadvantages
• Difficulty expressing large proteins (>50 kDa)
• No glycosylation or signal peptide removal
• Eukaryotic proteins are sometimes toxic
• Can’t handle S-S rich proteins

Cloning & Transforming in Yeast Cells

Pichia pastoris
• Yeast are single-celled eukaryotes
• Behave like bacteria, but have key advantages of eukaryotes
• P. pastoris is a methylotrophic yeast that can use methanol as its sole carbon source (using alcohol oxidase)
• Has a very strong promoter for the alcohol oxidase (AOX) gene (~30% of the protein produced when induced)

Pichia pastoris Cloning
• Uses a special plasmid that works both in E. coli and yeast
• Once the gene of interest is inserted into this plasmid, the plasmid must be linearized (cut open so it isn’t circular)
• A double cross-over recombination event inserts the gene of interest directly into the P. pastoris chromosome where the old AOX gene used to be
• The gene of interest is now under control of the powerful AOX promoter

Pichia Systems
Advantages
• Grow quickly (8 hrs to produce protein)
• Very high yields (50-5000 mg/L)
• Low cost of media (simple media constituents)
• Low fermentor costs
More advantages
• Can express large proteins (>50 kDa)
• Glycosylation & signal peptide removal
• Has chaperonins to help fold “tough” proteins
• Can handle S-S rich proteins

Baculovirus Expression
• Autographa californica multiple nucleopolyhedrovirus (AcMNPV), a baculovirus
• Virus commonly infects insect cells of the alfalfa looper (a moth) or armyworms (and their larvae)
• Uses the super-strong promoter of the polyhedrin coat protein to enhance expression of proteins while the virus resides inside the insect cell

[Figure: Baculovirus expression timeline, ~12 days]

Baculovirus (AcMNPV) Cloning Process
[Figure: the cloned gene in the transfer vector recombines (x x) with the polyhedrin gene region of AcMNPV DNA to give recombinant AcMNPV DNA]

Baculovirus Systems
Disadvantages
• Grow very slowly (10-12 days for set-up)
• Cell culture is only sustainable for 4-5 days
• Set-up is time consuming, not as simple as yeast
Advantages
• Can express large proteins (>50 kDa)
• Correct glycosylation & signal peptide removal
• Has chaperonins to help fold “tough” proteins
• Very high yields, cheap

Mammalian Expression Systems

Mammalian Cell-line Expression
• Sometimes required for difficult-to-express proteins or for “complete authenticity” (matching glycosylation and sequence)
• Cells are typically derived from the Chinese Hamster Ovary (CHO) cell line
• Vectors usually use SV40 virus, CMV or vaccinia virus promoters and DHFR (dihydrofolate reductase) as the selectable marker gene

Mammalian Expression
• Gene initially cloned and plasmid propagated in bacterial cells
• Mammalian cells transfected by electroporation (with linear plasmid); the gene integrates (1 or more times) into random locations within different CHO chromosomes
• Multiple rounds of growth and selection using methotrexate to select for those cells with the highest expression & integration of DHFR and the gene of interest

Methotrexate (MTX) Selection
1. Transfect dhfr− cells with the gene of interest plus DHFR
2. Grow in nucleoside-free medium; culture a colony of cells
3. Grow in 0.05 µM MTX; culture a colony of cells
4. Grow in 0.25 µM MTX; culture a colony of cells
5. Grow in 5.0 µM MTX; culture a colony of cells
6. Foreign gene expressed at a high level in CHO cells

Mammalian Systems
Disadvantages
• Selection takes time (weeks for set-up)
• Cell culture is only sustainable for a limited period of time
• Set-up is very time consuming, costly, modest yields
Advantages
• Can express large proteins (>50 kDa)
• Correct glycosylation & signal peptide removal, generates authentic proteins
• Has chaperonins to help fold “tough” proteins

Conclusion
• Isolation of gene of interest
• Introduction of gene to expression vector
• Transformation into host cells
• Growth of cells through fermentation
• Isolation & purification of protein

Core Facility of Recombinant Protein Production (D1)

Expression systems
• E. coli
• Baculovirus
• Yeast
• Cell-free
• Mammalian cell (not open for service)

Expression Systems
E. coli
  Advantages: • Parallel cloning • Fast • Ease of use • Low cost
  Disadvantages: • Poor expression • Low solubility • Lacking post-translational modifications
Cell-free
  Advantages: • Faster • Skips cell transformation, growing, and lysis
  Disadvantages: • Low protein yield • Expensive • Tricky to optimize the lysate and expression conditions
Yeast
  Advantages: • Glycosylation • Efficient, economical • Proteins with disulfide bonds
  Disadvantages: • Glycosylation differs from that of mammalian cells
Baculovirus
  Advantages: • Most proper eukaryotic
  Disadvantages: • Duration of expression limited to infection period • Virus production involves numerous steps • High virus titers must be maintained
Mammalian cells
  Advantages: • Native environment for mammalian proteins
  Disadvantages: • Lower protein yield • Expensive

E. coli - the most popular expression system

E. coli Expression System
-challenge
• poorly expressed
• protein insoluble (inclusion bodies)
• expressed and soluble: 20-30%
-improvement
• growth condition (e.g. temperature)
• codon usage
• host strain
• fusion to carrier protein

E. coli Expression System: parallel screening for soluble proteins
Rationale
1. Increase the expression level and solubility of the target protein with protein tags.
2. Simultaneously screen different fusion tags in parallel.
3. Has potential for automating gene cloning.
Publication: Shih YP et al., Protein Science (2002), 11:1714-1719.

[Figures: E. coli sticky-end PCR; parallel gene cloning; parallel screening for soluble protein; statistical analysis of soluble protein ratio]

E. coli Expression System - Modified version
[Figure: vector map with EcoR I and Xho I sites between promoter and terminator; fusion tag, His*6, thrombin and FXa sites, target protein]
To improve consistency and convenience, we have modified the above vectors to include a hexa-His tag and a Factor Xa cleavage site at the N-terminus of each protein expressed in E. coli.

E. coli: Comparison of techniques
• The choice of fusion proteins is similar; the main difference lies in the cloning (Hammarström et al. Protein Science (2002), 11:313–321).
• Others have used a commercialized strategy, Gateway Technology (Invitrogen): PCR → ligation into donor vector → purify plasmid → co-transformation.
• We use the sticky-end PCR method, which allows parallel cloning without a sub-cloning step: PCR → denature → re-nature.

E. coli Expression System Summary
• The method introduces sticky ends to target genes without using restriction enzymes.
• Well-induced and highly soluble recombinant proteins: 80% success

Alternative Expression Systems
Baculovirus expression system
• EGFP expressed in baculovirus-transfected insect cells [Figure panels: bright field, UV, merged]
Cell-free expression system
• [Gel lanes] 1: Negative control; 2: Positive control (GFP); 3: Hpps component II; 4: Hyaluronan synthase; 5: Rubber prenyl transferase; 6: Marker
Yeast expression system
• [Gel lanes] 1: Marker; 2: N3D TPL-2 using horseshoe crab signal peptide; 3: N3D TPL-2 using Pichia signal peptide

Services
Each entry gives: service no., service name, specification, fee (NT$)
• D1-1: Expression screening of soluble recombinant proteins (E. coli system). Specification: transformed E. coli strain. Fee: 14,000
• D1-2: Expression screening of soluble recombinant proteins (E. coli system), technology transfer. Set according to requirements
• D1-3: Recombinant protein expression screening in the yeast system. Specification: Pichia system. Fee: 27,500
• D1-4: Cell-free recombinant protein expression screening (using the vector dedicated to this system). Specification: cell-free system. Fee: 18,500
• D1-5: Cell-free recombinant protein expression screening (customer-supplied plasmid). Specification: cell-free system (customer-supplied plasmid). Fee: 7,300
• D1-6: Recombinant protein expression screening in the baculovirus system. Specification: baculovirus expression system. Fee: 36,300
http://proteome.sinica.edu.tw/prod_services_01.asp

Expression Systems and fees
E. coli (14,000 NT$)
  Advantages: • Parallel cloning • Fast • Ease of use • Low cost
  Disadvantages: • Poor expression • Low solubility • Lacking post-translational modifications
Cell-free (18,500/7,300 NT$)
  Advantages: • Faster • Skips cell transformation, growing, and lysis
  Disadvantages: • Low protein yield • Expensive • Tricky to optimize the lysate and expression conditions
Yeast (27,500 NT$)
  Advantages: • Glycosylation • Efficient, economical • Proteins with disulfide bonds
  Disadvantages: • Glycosylation differs from that of mammalian cells
Baculovirus (36,300 NT$)
  Advantages: • Most proper eukaryotic
  Disadvantages: • Duration of expression limited to infection period • Virus production involves numerous steps • High virus titers must be maintained
Mammalian cells
  Advantages: • Native environment for mammalian proteins
  Disadvantages: • Lower protein yield • Expensive

Flow chart of protein production
[Figure: Service requested → parallel cloning → expression test in E. coli. Standard route: soluble protein → protein purification → protease cleavage to remove tag. Additional-charge route: insoluble protein or post-translational modification required → yeast system, baculovirus system, or in vitro expression systems.]

Self-cleavage of fusion protein in vivo using TEV protease to yield native protein
-challenge to the fusion protein method: separation of the passenger target protein from the fusion carrier
• fusion carriers cannot be processed by proteolysis
• cleaved products aggregate immediately
• cleaved products contain extraneous amino acid residues
-our approach
• TEVP intracellular processing system

Tobacco etch virus protease (TEVP)
Recognition sequence: -Glu(P6)-P5-P4-Tyr(P3)-P2-Gln(P1)- -P1'- (cleavage occurs between P1 and P1')

TEVP intracellular processing system
• In vivo cleavage of fusion proteins
• Different amino acid residues at the P1' position
  – more effective than an intermolecular enzymatic reaction
  – works even with Pro in the P1' position
• All six vectors successfully carried out intracellular cleavage

TEVP intracellular processing system Summary
• introduce cloning sites to target genes without using restriction enzymes
• produce native proteins with original amino termini in vivo via intracellular self-cleavage
• skip tedious optimization of cleavage conditions
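
The TEVP recognition pattern given in these slides, Glu(P6)-P5-P4-Tyr(P3)-P2-Gln(P1) followed by P1', can be searched for in a fusion protein sequence with a few lines of Python. This is only a minimal sketch, not the facility's software. The function names `find_tev_sites` and `cleave_at_tev_sites` and the example fusion sequence are hypothetical. The pattern fixes only the three residues named on the slide (E, Y, Q), and the cut is assumed to fall between P1 and P1'.

```python
import re

# Pattern from the slide: Glu(P6)-P5-P4-Tyr(P3)-P2-Gln(P1), then any P1' residue.
# The lookahead wrapper lets overlapping candidate sites be reported as well.
TEV_SITE = re.compile(r"(?=(E..Y.Q)(.))")


def find_tev_sites(protein):
    """Return (cut_index, site, p1_prime) for each candidate site.

    cut_index is the 0-based index of the P1' residue, i.e. the first
    residue of the fragment released downstream of the cut.
    """
    hits = []
    for match in TEV_SITE.finditer(protein):
        cut_index = match.start() + 6  # P6..P1 occupy six residues
        hits.append((cut_index, match.group(1), match.group(2)))
    return hits


def cleave_at_tev_sites(protein):
    """Split the sequence between P1 and P1' at every candidate site."""
    fragments = []
    start = 0
    for cut_index, _site, _p1_prime in find_tev_sites(protein):
        fragments.append(protein[start:cut_index])
        start = cut_index
    fragments.append(protein[start:])
    return fragments


if __name__ == "__main__":
    # Hypothetical fusion: His-tagged carrier, ENLYFQ site (a commonly used
    # TEV site), then a made-up target whose own Met sits at P1'.
    fusion = "MHHHHHHSSGENLYFQMKTAYIAKQR"
    for cut_index, site, p1_prime in find_tev_sites(fusion):
        print(f"site {site}, P1' = {p1_prime}, cut before index {cut_index}")
    print(cleave_at_tev_sites(fusion))
```

Placing the target's first residue at P1', as in the example, is what lets the released fragment start with its original amino terminus. The sketch only locates pattern matches and says nothing about how efficiently a given P1' residue is processed.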