LECTURE 15
EUKARYOTIC EXPRESSION VECTOR
LYNN CALLISON
CHEMICAL BIOLOGY
DR. KUANG-YU CHEN
RUTGERS UNIVERSITY

I. OVERVIEW
II. RECOMBINANT GENE EXPRESSION
III. OPERON CONTROLS GENE EXPRESSION
IV. EUKARYOTIC EXPRESSION SHUTTLE VECTOR
V. TAGGING SEQUENCE OVERVIEW
   A. POLYHISTIDINE AND EPITOPE TAGS
   B. GREEN FLUORESCENT PROTEIN TAG
VI. TRANSFECTION TECHNOLOGY OVERVIEW
   A. CHEMICAL TRANSFECTION
   B. PHYSICAL AND BIOLOGICAL TRANSFECTION
VII. CURRENT NEWS: GENE THERAPY
VIII. WORKS CITED

OVERVIEW

An expression vector is used to express a cloned gene in a host cell. The vector contains regulatory sequences that direct the host cell to transcribe the foreign gene into messenger RNA (mRNA). The resulting mRNA is then translated into protein (fig1). Expression vectors are prepared as plasmids, phages, or phagemids depending on the desired efficiency. For expression of human proteins, the vector requires regulatory sequences, such as promoters that recruit RNA polymerase, for both prokaryotes and eukaryotes. Vectors for human proteins are first propagated in bacterial host cells for ease of use and are then moved to eukaryotic host cells such as yeast for eukaryotic post-translational modifications. The cloned gene is often engineered to contain an additional sequence that allows the protein to be purified and quantified in the laboratory. Figure 2 depicts a protein that was cloned with an attached green fluorescent protein (GFP) sequence. Expression vectors are useful for the development of medical treatments. These vectors have been used to produce human insulin in vitro for the treatment of diabetes. Expression vectors are designed to produce large amounts of one desired segment of mRNA and the resulting protein on command. (4, 12, 13)

Figure 1: Overview of gene expression vector. Numerous processes are required for preparation of both the target DNA and target protein

Figure 2: Image of structural proteins bound to GFP

RECOMBINANT GENE EXPRESSION

Expression vectors are prepared by standard cloning techniques.
The target DNA of the expression vector is prepared from target mRNA (fig3). The cell transcribes DNA to RNA and then translates RNA to protein. If the amino acid sequence of the desired target protein is known, the genetic code can be used to translate the amino acid sequence back into a nucleotide sequence. From this nucleotide sequence, a complementary, radioactively labeled probe can be prepared to identify the complete mRNA among the cell's contents. Once the complete protein-encoding mRNA is obtained, reverse transcriptase uses the mRNA sequence to generate a complementary DNA (cDNA) strand. An RNase enzyme (such as RNase H) is used to digest the phosphodiester backbone of the mRNA. Once the mRNA is removed from the single-stranded cDNA, reverse transcriptase is used to prepare a double-stranded cDNA segment.

For introduction into the vector, ligase is used to attach synthetic ‘linker’ sequences to the ends of the double-stranded cDNA. These linkers contain restriction enzyme recognition sites; cutting them supplies ‘sticky ends’ that anneal with the matching cut site in the vector. Ligase is used again to attach the cDNA to the ‘cloning site’ of the vector. The cloning site consists of an assortment of restriction endonuclease recognition sites. It supplies a source of potential sticky ends for the attachment of a target gene. The cloning site is strategically located downstream of regulatory sequences. The promoter regulatory sequence recruits RNA polymerase to the site. RNA polymerase binding causes the downstream DNA sequence to be transcribed into RNA. The adjacent ribosome-binding regulatory sequence causes the transcribed RNA to contain a nucleotide sequence for ribosome attachment and enables initiation of translation of the RNA into protein. For efficient protein expression in a host cell, each vector must contain a replication origin (ori) that recruits DNA polymerase when the DNA is replicated during cell division.
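The mRNA-to-cDNA step described above can be illustrated with a short sketch. It only models base pairing: the first cDNA strand is the reverse complement of the mRNA, and the second strand is the reverse complement of the first. The function names and the example sequence are hypothetical and chosen for illustration; no enzyme kinetics, primers, or linkers are modeled.

```python
# Base-pairing rules: an RNA template base pairs with a DNA base,
# and a DNA base pairs with its DNA complement.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the cDNA strand (written 5'->3') complementary to an mRNA.

    This stands in for the reverse transcriptase step.
    """
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def second_strand_cdna(first_strand: str) -> str:
    """Return the DNA strand (written 5'->3') complementary to the first strand."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand.upper()))


# Hypothetical three-codon message: start codon, one Ala codon, stop codon.
mrna = "AUGGCCUAA"
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)   # TTAGGCCAT
print(second)  # ATGGCCTAA
```

The second strand carries the same sequence as the mRNA with T in place of U, which is why the double-stranded cDNA can stand in for the gene's coding sequence in the vector.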
Additionally, each vector has an antibiotic resistance gene (for example ampicillin resistance, Amp) so that scientists can grow host cells as colonies on antibiotic-containing medium and identify the cells that have incorporated the vector and are therefore likely to produce the target protein. (9, 13)

Figure 3: (left) Procedure for preparation of target DNA in cloning vector; (right) regulatory sequences required to enable the vector to form target protein

OPERON CONTROLS GENE EXPRESSION

An operon is a group of genes transcribed from a single promoter, together with the regulatory sequences that control their expression. The operon's regulatory sequences determine whether the target gene is expressed constantly (constitutive) or is triggered to make protein on command by the presence of a stimulating molecule (induced). For laboratory efficiency, expression vectors are inducible, also known as regulated. The basic expression vector contains a replication origin (ori), a selectable antibiotic-resistance gene, and a strong, regulated promoter. This section uses the lac operon to discuss how an operon regulates and induces the promoter. (2)

The lac operon consists of a repressor gene (lacI), promoter (P), operator (O), structural genes (Z, Y, and A), and transcription/translation termination sequences (fig4). The promoter is the DNA sequence in the operon that RNA polymerase binds in order to transcribe the DNA sequence into RNA. The operon is repressed when the lac repressor protein (encoded by lacI) binds to the operator (O). The lac repressor is transcribed and translated constitutively; it prevents transcription by binding at the operator and blocking the forward movement of RNA polymerase. In the presence of lactose, the lac repressor dissociates from the operator. RNA polymerase then proceeds to transcribe the Z, Y, and A structural genes. The Z, Y, and A structural genes are separated by ribosome start and stop codons.
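The repressor logic described above can be summarized as a small truth-table sketch. It is a deliberately simplified model: it covers only the repressor/operator switch from this section and ignores other controls on the natural lac operon, such as glucose-dependent catabolite repression. The function and parameter names are hypothetical.

```python
LAC_STRUCTURAL_GENES = ["lacZ", "lacY", "lacA"]


def repressor_on_operator(inducer_present: bool, repressor_functional: bool = True) -> bool:
    """The repressor sits on the operator only if it is functional and no inducer is bound."""
    return repressor_functional and not inducer_present


def transcribed_genes(inducer_present: bool, repressor_functional: bool = True) -> list:
    """Return the structural genes RNA polymerase can transcribe in this state."""
    if repressor_on_operator(inducer_present, repressor_functional):
        return []  # polymerase is blocked at the operator
    return list(LAC_STRUCTURAL_GENES)


# Inducer = lactose or its synthetic analog IPTG.
for inducer in (False, True):
    print(f"inducer present: {inducer} -> transcribed: {transcribed_genes(inducer)}")
```

With a functional repressor the genes are "off" without inducer and "on" with it; a non-functional repressor (`repressor_functional=False`) gives constitutive expression in this model.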
These separate start and stop codons enable the ribosome to translate three separate enzymes that promote lactose metabolism. lacZ encodes the enzyme β-galactosidase, which hydrolyzes lactose to glucose and galactose. lacY encodes permease, which transports lactose into the cell. lacA encodes transacetylase, whose role in lactose metabolism is unknown. Overall, the expression of the lactose metabolism enzymes is “switched on” (regulated) by the presence of lactose. (1, 2)

The structure of the lac operon is frequently used to regulate the expression vector promoter. Inducible expression, such as that observed in the lac operon in the presence of lactose, is beneficial for two reasons. First, high levels of a foreign protein can be toxic to the host cell. Second, expressing a foreign protein at a constant level consumes valuable host cell energy and can prevent the host cell population from growing large enough to yield a sufficient harvest of protein product. In the laboratory, scientists apply the lac operon concept by constructing a vector that carries the lacI repressor gene upstream of a promoter, an operator, and the target gene for the desired protein. In this way, scientists can supply or deprive the host cell of lactose (or its synthetic analog IPTG, isopropyl β-D-1-thiogalactopyranoside) to turn transcription of the target gene “on” and “off”. (4)

Figure 4: Diagram of the lac operon as it is transcribed into mRNA and translated into protein

EUKARYOTIC EXPRESSION SHUTTLE VECTOR

Expression vectors produce the target protein encoded by a cDNA gene. They consist of an origin of replication, a promoter, polyadenylation signals, antibiotic resistance genes, and the cDNA target gene. They use a strong inducible promoter to produce the largest possible amount of target protein upon application of a chemical stimulus (e.g. IPTG). Furthermore, eukaryotic expression vectors are also known as shuttle vectors.
Shuttle vectors enable efficient cloning and production of an optimum amount of active target protein or enzyme. (1, 2)

Shuttle vectors can replicate in both prokaryotes and eukaryotes and must possess an origin of replication for each. Although prokaryotic cells are more convenient for the initial cloning and transformation steps, eukaryotic host cells apply post-translational modifications that yield cloned human proteins of high enzymatic activity. When eukaryotic proteins are cloned, eukaryotic host cells carry out post-translational modifications, such as phosphorylation and glycosylation, that are specific to eukaryotes. Furthermore, eukaryotic host cells possess an intracellular environment that optimizes protein folding and minimizes aggregation. Overall, shuttle vector cloning sequences are optimized in prokaryotes and are then transferred to eukaryotic cells to harvest a eukaryotic protein of optimal activity.

Figure 5: Schematic of Eukaryotic Expression Vector

A eukaryotic protein expression vector (fig5) must contain: (1, 2, 7)

1. Prokaryotic and eukaryotic origins of replication
   f1 origin (ori): allows rescue of single-stranded DNA
   SV40 origin (ori): allows efficient, high-level expression of the neomycin resistance gene and episomal replication in cells expressing SV40 large T antigen (eukaryote)
   pUC origin (ori): high-copy-number replication and growth in E. coli (prokaryote)
2. Strong inducible promoter upstream of a multiple cloning site (MCS)
   SV40 early promoter (PSV40): allows efficient, high-level expression of the neomycin resistance gene and episomal replication in cells expressing SV40 large T antigen
   Human cytomegalovirus (PCMV) immediate-early promoter/enhancer: permits efficient, high-level expression of the recombinant protein
   T7 promoter/priming site: allows in vitro transcription in the sense orientation and sequencing through the insert
   Multiple cloning site in forward or reverse orientation: allows insertion of the gene and facilitates cloning
3. Transcription termination and polyadenylation signals
   Bovine growth hormone polyadenylation signal (BGH pA): efficient transcription termination and polyadenylation of mRNA
   SV40 early polyadenylation signal (SV40 pA): efficient transcription termination and polyadenylation of mRNA
4. Antibiotic resistance genes
   Neomycin resistance gene: selection of stable transfectants in eukaryotic cells
   Ampicillin resistance gene (β-lactamase): selection of the vector in prokaryotic cells
5. Target gene
   cDNA encoding the complete eukaryotic target protein

TAGGING SEQUENCE OVERVIEW

Once the vector is cloned and successfully transcribes mRNA, the resulting target protein is purified through tagging techniques. The tag consists of an additional amino acid sequence that aids purification or detection through intracellular targeting signals, green fluorescent protein (GFP) fluorescence, metal ion coordination, or antibody recognition. In the laboratory, a tag is added during vector preparation by joining the gene sequence for the target protein to that of the tagging protein without an intervening stop codon (fig6). The result is one long peptide that has the features of both the target protein and the tagging protein. (4, 13)

Targeting sequences that are naturally fused to proteins are present in cells to direct proteins to the necessary organelles.
They can send proteins to the endoplasmic reticulum, mitochondrion, chloroplast, peroxisome, or nucleus. These amino acid sequences are listed in figure 7. Also known as a topogenic tag, this sequence of residues ensures that the peptide is in the correct orientation when it is incorporated into the organelle membrane. In the laboratory, scientists use the endoplasmic reticulum or Golgi apparatus targeting sequences to localize target proteins to secretory vesicles. By initiating secretion of the cloned target protein with a targeting tag, scientists can easily harvest the protein and separate it from other cellular biomolecules. (4, 9)

Figure 6: Transcription and translation of fusion protein

Figure 7: Targeting sequences that direct proteins to organelles
Target organelle | Location of signal in protein | Nature of signal
Endoplasmic reticulum | N-terminal | “Core” of 6–12 mostly hydrophobic amino acids, often preceded by one or more basic amino acids
Mitochondrion | N-terminal | 3–5 nonconsecutive Arg or Lys residues, often with Ser and Thr; no Glu or Asp residues
Chloroplast | N-terminal | No common sequence motifs; generally rich in Ser, Thr, and small hydrophobic amino acid residues and poor in Glu and Asp residues
Peroxisome | C-terminal | Usually Ser-Lys-Leu at extreme C-terminus
Nucleus | Internal | One cluster of 5 basic amino acids, or two smaller clusters of basic residues separated by ≈10 amino acids

POLYHISTIDINE AND EPITOPE TAGS

A polyhistidine tag is useful for purification of the cloned protein. The tag is a sequence of approximately six histidine residues translated at the N-terminal end of the target protein. It is prepared in the vector by placing the polyhistidine coding sequence upstream of the cloned target protein without an intervening stop codon. The polyhistidine tag enables purification by affinity chromatography on a sepharose gel with bound nickel ions (Ni2+) (fig8).
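The construction of the tagged coding sequence described above can be sketched as follows. This is a minimal illustration, not a cloning protocol: it prepends a start codon and six histidine codons to a target coding sequence and checks that no stop codon interrupts the reading frame before the final codon. The function names and the example sequence are hypothetical; CAT is used as the histidine codon (CAC would also encode histidine).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a DNA sequence into complete in-frame codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def build_his_fusion(target_cds: str, n_his: int = 6) -> str:
    """Return start codon + His tag + target coding sequence as one reading frame.

    Raises ValueError if the target is out of frame or if a stop codon
    appears before the last codon, since either would separate tag and target.
    """
    target = target_cds.upper()
    if len(target) % 3 != 0:
        raise ValueError("target coding sequence length must be a multiple of 3")
    fusion = "ATG" + "CAT" * n_his + target
    if any(codon in STOP_CODONS for codon in codons(fusion)[:-1]):
        raise ValueError("in-frame stop codon between tag and end of target")
    return fusion


# Hypothetical toy target: start codon, one Ala codon, stop codon.
fusion = build_his_fusion("ATGGCCTAA")
print(codons(fusion))
```

Because the tag and target share one frame with no stop codon between them, the ribosome reads straight through and produces a single polypeptide carrying the histidine residues at its N-terminal end.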
After the host cells are collected and lysed, the target protein is released for purification. The nickel ions on the sepharose coordinate with the histidine residues and cause the target protein to stay on the column. The protein is released when the column is rinsed with a high concentration of histidine (or the histidine analog imidazole), which acts as a competitive ligand for the nickel ions. The target protein is then separated from the polyhistidine tag by cleavage at an enterokinase cleavage site. A final affinity column purifies the mixture by removing the polyhistidine tag. (13)

Figure 8: Affinity column purification of peptide with polyhistidine tag

Antibody recognition of epitope tags is also used for recombinant protein purification. An epitope is the portion of an antigen that is recognized by antibodies. Eukaryotic expression vector DNA is prepared to include the coding sequence for a common epitope. Common epitopes are recognized by antibodies that can be obtained commercially. They are short peptide sequences derived from full proteins (fig9). This system allows light or electron microscope analysis by using an anti-epitope antibody in conjunction with a labeled secondary antibody (fig10). Furthermore, the epitope-tagged protein can be purified by immunoprecipitation or immunoaffinity chromatography, or isolated in a colony. (13)

Figure 9: Common epitope tag sequences
Tag | Sequence
HIS | HHHHHH
c-MYC | EQKLISEEDL
HA | YPYDVPDYA
VSV-G | YTDIEMNRLGK
HSV | QPELAPEDPED
V5 | GKPIPNPLLGLD

Figure 10: Colony isolation of peptide bearing an epitope tag

GREEN FLUORESCENT PROTEIN TAG

Green fluorescent protein (GFP) is used as a fluorescent tag fused to the target protein. The location of the GFP fusion protein can be tracked by fluorescence microscopy in a living or dead cell. Localization of the target protein provides information about its function. The fusion protein is prepared in the expression vector by placing the GFP coding sequence upstream of the target protein without an intervening stop codon (fig11A).
Originally found in jellyfish, GFP consists of 238 residues that make up a β-barrel structure surrounding the chromophore residues Tyr66 and Ser65 (fig11B). GFP is photoactivated by exposure to 405 nm light, which causes decarboxylation of Glu222 and excitation of the pi-electron clouds of Tyr66 and Ser65 (fig11C). The resulting light emission is observed by fluorescence microscopy and allows scientists to observe the location of the target protein in the cell. (7, 11, 13)

Figure 11: (A) Vector diagram for a target protein that contains a GFP tag, a myc epitope tag, and three repeats of a nuclear targeting sequence (B) Rendered image of GFP (C) Photoactivation reaction of essential residues in GFP

TRANSFECTION TECHNOLOGY OVERVIEW

Introduction of foreign DNA into eukaryotic cells is called transfection. Cells exposed to DNA are coerced into taking up the DNA. Normally, transfection is transient: the foreign DNA vector is only temporarily present in the host cell population. Transient transfection is useful if the gene product can be harvested quickly. It supplies a temporary, high level of gene expression for approximately 1 to 4 days following transfection. However, if laboratory procedures require a host cell that can supply the cloned gene product for a long period, stable transfection is necessary. Stable transfection occurs when the cloned gene is integrated into the host genome, expressed, and retained with each cell division cycle. This occurs at a frequency of <0.1%. The stably transfected cells are identified through use of a selectable marker. A selectable marker is a genetic sequence included in the vector that gives the host cell an identifiable trait such as drug resistance or a metabolic capability. Several common selectable markers are listed in figure 12.
The pathway of stable transfection in figure 13 shows that the circular vector DNA, which contains the target gene (red) and the selectable marker (blue), is integrated into the linear host genome during the last step. Note that calcium phosphate is used to coprecipitate the DNA during the initial transfection phase. Calcium phosphate coprecipitation is one of several chemical, physical, or biological methods of transfection. (4)

Figure 12: Selectable Markers for Stable Transfection
Aminoglycoside phosphotransferase (APH; neoR gene): neomycin resistance
Adenosine deaminase (ADA): enzyme involved in purine metabolism. It is needed for the breakdown of adenosine from food and for the turnover of nucleic acids in tissues.
Dihydrofolate reductase (DHFR): enzyme that reduces dihydrofolic acid to tetrahydrofolic acid, using NADPH as electron donor
Thymidine kinase (TK): enzyme with a key function in DNA synthesis and cell division, as it assists the incorporation of deoxythymidine into DNA
Xanthine-guanine phosphoribosyl transferase (XGPRT; gpt gene): a purine salvage enzyme

Figure 13: The process of stable transfection, beginning at the vector and ending at the cell

CHEMICAL TRANSFECTION

Chemical transfection of expression vectors into eukaryotic cells is mediated by calcium phosphate, liposomes, DEAE-dextran, or dendrimers. Coprecipitation of the DNA with calcium phosphate is an inexpensive method of chemical transfection. Calcium phosphate is shown in step one of figure 13 and as a structure in figure 14A. The mechanism of calcium phosphate transfection is not completely understood. However, it is believed that a precipitate of the calcium phosphate salt Ca3(PO4)2 promotes adherence of DNA to the surface of the crystal. This DNA-coated salt is then applied to the host cell, causing it to take up a portion of the foreign DNA. Lipofection, or liposome-mediated transfection, is another common method of chemical transfection.
The molecules used in lipofection consist of a cationic head and a hydrocarbon tail (fig14B). They are similar in structure to the phospholipids of the cell membrane. They function by surrounding the foreign DNA and facilitating its fusion with the cell membrane. Once the complex is associated with the host cell membrane, the DNA is released into the host cell.

Diethylaminoethyl-dextran (DEAE-dextran) functions similarly to lipofection. As shown in figure 14C, DEAE-dextran possesses a sugar chain linked by an O-glycosidic bond to a cationic tail. The negatively charged DNA binds to the cationic tail. The DNA-bound molecule retains excess cationic character and uses this charge to become enmeshed in the phospholipid bilayer. Evidence suggests that the DEAE-dextran-DNA complex enters the cell via endocytosis.

Dendrimer technology further elaborates on the chemical aggregation method of DNA transfection. As shown in figure 14D, dendrimers are branched organic molecules with polar functional groups bound to the end of each branch. The structure is synthesized via a series of substitution reactions. It is engineered with the goal of encapsulating DNA molecules. The positively charged amino group at the end of each branch of the dendrimer is designed to interact with the negatively charged phosphate groups of DNA. Overall, the chemical methods of transfection exploit the polar nature of DNA to form a chemical aggregate that is compatible with the hydrophobic and hydrophilic interactions of the phospholipid bilayer. (4, 14)

Figure 14: Chemical Methods of Transfection (A) Calcium phosphate (B) Liposomes (C) DEAE-dextran (D) Dendrimer

PHYSICAL AND BIOLOGICAL TRANSFECTION

Physical transfection methods employ scientific instruments to introduce foreign DNA into a eukaryotic cell. Physical methods include electroporation, ballistics, and microinjection. Electroporation is the application of an electric field to a mixture of host cells and foreign DNA.
The electric field is believed to disrupt the phospholipid bilayer of the host cell and give the foreign DNA access to the cell. The left side of figure 15A depicts the electroporation apparatus. The right side of the figure shows an idealized membrane pore formation. However, the exact structure of the membrane pores resulting from electroporation cannot be confirmed.

A ballistic particle delivery system, or gene gun, is able to transform eukaryotic cells as well as prokaryotic and plant cells (fig15B). It “injects” DNA by propelling a DNA-coated metal particle into the target cell. Gene guns can introduce DNA into any part of the host cell, including the nucleus and other organelles.

In addition to cloning of eukaryotic expression vectors, microinjection is used for cloning of an entire organism. It uses a glass micropipette to insert DNA into the nucleus of a single living cell. This procedure requires a specialized optical microscope, a holding pipette, and a micropipette of 0.5–5.0 µm diameter (fig15C). Physical transfection methods require expensive equipment and are typically used after chemical transfection has proved ineffective. (14)

Figure 15: Physical Transfection (A) Electroporation (B) Gene gun (C) Microinjection

Figure 16: Biological Transfection: lentiviral vector

Biological transfection, also known as infection, uses lentiviral vectors to introduce cloned DNA into eukaryotes. A lentiviral vector is developed by removing the pathogenesis genes from a retrovirus. Lentiviral vectors are very efficient for biological transfection of eukaryotic cells. They are useful for stable transfection because of the enzyme integrase. After the vector infects the host cell, integrase incorporates the foreign DNA into the host's genome. Figure 16 shows that retroviral vectors are cloned in host cells and can then be isolated as virus particles for an infection that results in a stable transfectant. Because of their efficiency, retroviral vectors are used in gene therapy.
Gene therapy is the introduction of foreign genes to correct malfunctioning, disease-causing genes. (4, 13)

CURRENT NEWS: GENE THERAPY

LENTIVIRAL EXPRESSION VECTOR FOR PARKINSON’S DISEASE

Evidence from a primate study indicates that lentiviral vectors can be used to treat the symptoms of Parkinson's disease. These symptoms are caused by degeneration of the dopaminergic neurons of the substantia nigra (SN) section of the basal ganglia in the brain. Glial cell line-derived neurotrophic factor (GDNF) acts as a neuroprotective agent that can treat the diseased neurons of Parkinson's disease. Scientists prepared a lentiviral expression vector designed to produce GDNF (lenti-GDNF) in the SN. For experimental purposes, the disease model was prepared by injecting adult monkeys with the compound 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP) to cause motor defects. The experiment proceeded by injecting the monkeys with lentiviral vectors for GDNF or, as a control, β-galactosidase. The monkeys were then subjected to a hand-reach task (fig17) before biopsy (fig18). During the hand-reach task, lenti-GDNF treatment was shown to improve the reactivity of MPTP monkeys. Furthermore, biopsy indicated that lenti-GDNF prevented SN degradation. (8)

Figure 17: Results of the hand-reach task show that lenti-GDNF treated monkeys perform better up to 3 months

Figure 18: SN neuron images showing (C) more neurons in the lenti-GDNF treated subject and (D) fewer neurons in the untreated subject

WORKS CITED

(1) Berg, J. et al. Biochemistry, 5th ed. W H Freeman and Company, New York, (2002) 27.2.
(2) Campbell, et al. Biology, 5th ed. Addison Wesley Longman, Inc., New York, (1999) 284-293.
(4) Cooper, G. M. The Cell - A Molecular Approach, 4th ed. Sinauer Associates, Inc, Sunderland MA, (2007) 201-226.
(5) Garrett and Grisham. Biochemistry, 2nd ed. Thomson: Brooks/Cole: United States (1999) 984-1037.
(7) Invitrogen: life technologies. “pcDNA” 2010, Mon 3 May <http://www.invirogen.com>
(8) Kordower, J. H. et al. "Neurodegeneration Prevented by Lentiviral Vector Delivery of GDNF in Primate Models of Parkinson's Disease." Science 290, 767-773 (2000).
(9) Lodish et al. Molecular Cell Biology. W H Freeman and Company, New York, (1999) 11.6.
(10) Papale, A. “Viral vector approaches to modify gene expression in the brain.” Journal of Neuroscience Methods 185, 1-14 (2009).
(11) Runions, John, et al. "Photoactivation of GFP reveals protein dynamics within the endoplasmic reticulum membrane." Journal of Experimental Botany 57(1), 43-50 (2006).
(12) Strachan, T. et al. Human Molecular Genetics, 2nd ed. Garland Science, New York, (1999) 22.3.2.
(13) Weaver, Robert F. Molecular Biology, 4th ed. McGraw Hill Higher Education: New York, (2008) 640-717.
(14) Wikipedia. “Transfection” 2010, Mon 3 May <http://en.wikipedia.org/wiki/Transfection>