Ghosh Lab
University of Arizona
Department of Chemistry

Outline
•Cloning overview
•pDRAW32
•Design
•Gene
•Insert
•Primers
•Further considerations (optimization of the process)
•Transformation

Cloning Overview
Four main steps in cloning:
•Insert synthesis
•Restriction enzyme digest
•Ligation
•Transformation
[Diagram: plasmid (vector) + insert (your gene) gives the functional construct]

Design Overview
Steps to follow in designing your cloning experiment:
•Design your gene
•Design your insert
•Pick your enzymes
•Check your design
•Recheck your design

pDRAW32
Plasmid maps: pDRAW32
All of the important information in one place!
[Figure: pDRAW32 map of pETDuet GFPuv (6104 bp). Every restriction site is listed as enzyme - position - recognition sequence, for example XbaI - 30 - T'CTAG_A and BamHI - 106 - G'GATC_C. Sites affected by dam methylation are flagged: ClaI - 6067 - AT'CG_AT and BclI - 5328 - T'GATC_A.]

pDRAW32
You can look at the sequence in detail
•Open reading frames
•Translation
•Restriction sites
•Complementary strand

Design of the Gene
If you are cloning out of a known plasmid, just use the sequence that you have
Example, the gene we want:
G C D R A S P Y C G
We got this from phage display (phage sequence):
ggctgcgacagggcgagcccgtactgcggt
G C D R A S P Y C G
Add a stop codon. Final sequence for the gene of interest:
ggctgcgacagggcgagcccgtactgcggttaa
G C D R A S P Y C G *

Design of the Gene
•If you are designing the gene from scratch, keep in mind codon usage
•Not all codons are created equal
•Un-optimized codons could lead to lower expression levels
•The codon usage reflects levels of tRNA available in E.
coli
•Pay attention to the stop codons too (XL1-Blue cells read through TAG, the amber stop codon, 20% of the time)

What if we don’t have the DNA sequence?
Design from scratch! (Don’t forget about codon usage.)

E. coli Codon Usage
(codon, amino acid, fraction of use among synonymous codons; * = stop)
UUU F 0.59    UCU S 0.17    UAU Y 0.6     UGU C 0.47
UUC F 0.41    UCC S 0.15    UAC Y 0.4     UGC C 0.53
UUA L 0.15    UCA S 0.15    UAA * 0.6     UGA * 0.31
UUG L 0.13    UCG S 0.13    UAG * 0.09    UGG W 1

CUU L 0.12    CCU P 0.19    CAU H 0.58    CGU R 0.35
CUC L 0.1     CCC P 0.13    CAC H 0.42    CGC R 0.34
CUA L 0.04    CCA P 0.21    CAA Q 0.34    CGA R 0.07
CUG L 0.46    CCG P 0.47    CAG Q 0.66    CGG R 0.12

AUU I 0.49    ACU T 0.19    AAU N 0.51    AGU S 0.16
AUC I 0.38    ACC T 0.38    AAC N 0.49    AGC S 0.23
AUA I 0.13    ACA T 0.19    AAA K 0.73    AGA R 0.08
AUG M 1       ACG T 0.24    AAG K 0.27    AGG R 0.05

GUU V 0.29    GCU A 0.19    GAU D 0.63    GGU G 0.34
GUC V 0.2     GCC A 0.26    GAC D 0.37    GGC G 0.36
GUA V 0.17    GCA A 0.24    GAA E 0.67    GGA G 0.14
GUG V 0.34    GCG A 0.31    GAG E 0.33    GGG G 0.16

or preferably…
http://www.bioinformatics.org/sms2/rev_trans.html
http://www.entelechon.com/index.php?id=tools/backtranslation&lang=eng

Choice of Restriction Sites/Enzymes
Once you have your gene, you need to design a way to get it into your plasmid
•Endonucleases (or restriction enzymes) are enzymes which cut DNA at specific internal recognition sequences
•Compare to exonucleases, which digest DNA from one end
•You must choose restriction sites that are available in the plasmid you are cloning into
•They must not appear in your gene (silent mutation can remove unwanted sites in your designed gene)

Really Important Factors to Remember When Choosing Restriction Enzymes
•Restriction sites must exist only once in your plasmid
•They must be in the correct position relative to the purification tag
•Restriction sites usually add extra residues to your gene product; make sure they are compatible with your peptide/protein
•Some restriction sites are sub-optimal for cloning
  •Blunt end sites
  •dam and dcm methylation-affected enzymes

Blunt vs Sticky Ends
Most common restriction enzymes
[Figure: digestion of double-stranded DNA that leaves single-stranded overhangs (“sticky ends”) on each fragment]
•“Sticky ends”: 5’ or 3’ overhangs that allow two DNA fragments to anneal even though they are not covalently joined
•Help with the next step: ligation
Blunt-end restriction enzymes
[Figure: digestion of double-stranded DNA that leaves flush ends with no overhangs (no sticky ends)]

dam Methylation
[Figure: Dam methylase adding a methyl group to N6 of adenosine]
•Dam methylase puts a methyl group on the nitrogen at the 6 position of adenosine at the site GATC
•All of the E. coli that we use generate DNA with dam methylation
•Some enzymes only cut dam methylated DNA: e.g. DpnI
•Some enzymes do not cut dam methylated DNA: e.g. XbaI (when a GATC overlaps the recognition site)
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/dam_dcm_methylases_of_ecoli.asp

dcm Methylation
[Figure: Dcm methylase adding a methyl group to C5 of cytidine]
•Dcm methylase puts a methyl group on the carbon at the 5 position of cytidine at the sites CCAGG and CCTGG
•The enzyme we use most that can be affected by dcm methylation is SfiI
•XL1-Blues and BL21s are both Dcm+
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/dam_dcm_methylases_of_ecoli.asp

Design of the Insert
•Once you have your restriction enzymes chosen, it is time to design the final complete gene
•The multiple cloning site (of whatever plasmid you are cloning into) should already have the 5’ portion of the gene intact (i.e.
RBS, spacer, Met)
•Sequences must be in frame
Sites labeled on the map: NcoI, BtgI, BamHI, EcoRI, SacI, EcoICRI, AscI, BssHII, SbfI, PstI, SalI, AccI, HindIII, NotI
 51 CTTTAATAAG GAGATATACC ATGGGCAGCA GCCATCACCA TCATCACCAC
    M G S S H H H H H H (translation starts at the ATG)
101 AGCCAGGATC CGAATTCGAG CTCGGCGCGC CTGCAGGTCG ACAAGCTTGC
    S Q D P N S S S A R L Q V D K L A

Design of the Insert
Multiple cloning site:
 71 ATGGGCAGCAGCCATCACCATCATCACCAC
    M G S S H H H H H H
101 AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC
    S Q D P N S S S A R L Q V D K L A
Sites labeled on the map: BamHI, EcoRI, SacI, EcoICRI, AscI, SbfI, PstI, SalI, AccI, HindIII
The gene we want:
ggctgcgacagggcgagcccgtactgcggttaa
G C D R A S P Y C G *
Be aware of the amber stop codon: TAG
Place the gene between the BamHI and PstI sites. Before:
AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC
S Q D P N S S S A R L Q V D K L A
After:
AGCCAGGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAA

Design of the Insert
Always check and re-check your sequence!
ATGGGCAGCA GCCATCACCA TCATCACCAC
AGCCAGGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAA
Translate the whole gene (- marks the stop codon):
atgggcagcagccatcaccatcatcaccacagccaggatccgggctgcgacagggcgagc
M G S S H H H H H H S Q D P G C D R A S
ccgtactgcggttaactgcaggtcgacaa
P Y C G - L Q V D
Everything looks good: in frame the whole way!

Design of the Insert
The wrong way to do it:
AGCCAGGATCC ggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAAGCTT
The gene is inserted directly after the restriction site, which puts it out of frame with the plasmid-encoded start codon/His-tag
atgggcagcagccatcaccatcatcaccacagccaggatccggctgcgacagggcgagcc
M G S S H H H H H H S Q D P A A T G R A
cgtactgcggttaactgcaggtcgacaagctt
R T A V N C R S T S
Frame shifted = garbage!
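The frame check on the slides above can also be scripted. What follows is a minimal Python sketch, not part of the original slides: it translates the in-frame and the frame-shifted constructs with the standard genetic code and reports whether the target peptide G C D R A S P Y C G survives up to the first stop codon. The helper names `translate` and `protein_until_stop` are hypothetical, introduced only for illustration; the sequences are copied from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def protein_until_stop(dna: str) -> str:
    """Return the peptide made before the first in-frame stop codon."""
    return translate(dna).split("*")[0]


# Sequences copied from the slides.
LEADER = "atgggcagcagccatcaccatcatcaccacagccag"  # M G S S H H H H H H S Q
GENE = "ggctgcgacagggcgagcccgtactgcggttaa"      # G C D R A S P Y C G *

in_frame = LEADER + "gatccg" + GENE + "ctgcaggtcgacaa"
frame_shifted = LEADER + "gatcc" + GENE + "ctgcaggtcgacaagctt"

for name, construct in [("in frame", in_frame), ("frame shifted", frame_shifted)]:
    peptide = protein_until_stop(construct)
    intact = peptide.endswith("GCDRASPYCG")
    print(f"{name}: {translate(construct)} -> target peptide intact: {intact}")
```

The single extra G after the BamHI site is what keeps the gene in frame; dropping it shifts every downstream codon, which the check reports as a lost target peptide.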
**Some plasmids, for whatever reason, have restriction sites out of frame with the translated gene**

Finishing Touches
•Restriction enzymes need extra base pairs on the 5’ and 3’ sides of the recognition site to cut properly
•NEB has a reference guide for specific enzymes (see link below)
•A good rule of thumb is 6 base pairs after the recognition site
•Inserting a GC “clamp” at the beginning and end of the sequence is also a good idea
atgggcagcagccatcaccatcatcaccacagccaggatccgggctgcgacagggcgagc
M G S S H H H H H H S Q D P G C D R A S
ccgtactgcggttaactgcaggtcgacaa
P Y C G - L Q V D
Final gene, polished and ready to go:
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/cleavage_linearized_vector.asp

Design of the Primers
Once the insert is designed correctly, the next step is designing primers to order from IDT, based on the insert synthesis strategy
Three main strategies towards insert synthesis:
•PCR amplification
•Klenow extension of overlapping primers
•Complementary full-length primers
[Diagram: insert + vector]

PCR Amplification of Insert from an Existing Gene
The most common method of insert synthesis
•Requires a pre-existing construct
•Extra restriction sites and/or amino acid residues can be added on each side of the gene
•Internal mutations are more difficult

PCR Synthesis of Insert
PCR amplification from overlapping primers
•No pre-existing construct is needed
•PCR products are messy, possibly making subsequent reactions difficult
•Good for inserts >150 bp
[Figure: overlapping primers F1 (10x), F2 (1x), R1 (1x), and R2 (10x) assembling into the insert]
Full-length insert should still be the major product

Klenow Extension of Overlapping Primers
•Two primers that are complementary in their 3’ regions are designed (overlap 15 bp)
•Extended to full length by the Klenow fragment of DNA Polymerase I
•Useful if insert is 50 to 150 bp
[Figure: two primers annealed at their 3’ ends are extended by Klenow to give the full-length insert]
Klenow fragment: retains 5’ to 3’ polymerase activity (and 3’ to 5’ proofreading exonuclease activity), but does not have 5’ to 3’ exonuclease activity

Complementary Full-Length Primers
•The simplest approach
•Order two primers that complement each other
•Mix the two primers, heat, and anneal slowly (to ensure proper base-pairing)
•Feasible if the total insert size is < 60 bp
[Figure: two full-length complementary primers annealed to form the insert]

Designing Primers to Order
Once the insert synthesis technique is decided, primer design is fairly straightforward
Forward primers:
•Assess necessary overlap and copy the sequence from your designed gene, along with extra 5’ sequence
Reverse primers:
•First, design exactly as if it were a forward primer: copy necessary overlap and extra 3’ sequence from your designed gene
•Once all this is in place, use the pDRAW32 sequence manipulator to calculate the reverse complement
•Order the pDRAW32-calculated sequence directly

Cloning Out an Existing Gene
In the example mentioned previously, we would normally use full-length overlapping primers, but let’s look at the more common case of having a preexisting gene:
Preexisting gene:
tgcggcccagccggccatgggctgcgacagggcgagcccgtactgcggtggaggcggtgctgcagcgc
A A Q P A M G C D R A S P Y C G G G G A A A
Goal gene:
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
(Overlap: the G C D R A S P Y C G coding sequence shared by both. The flanking regions of the goal gene are extra sequence from gene design.)
Forward Primer: gccagccaggatccgggctgcgacagg
Design of Reverse Primer: ccgtactgcggttaactgcaggtcgacgc

Ordering Primers
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
Forward primer to order: gccagccaggatccgggctgcgacagg
Design of Reverse Primer: ccgtactgcggttaactgcaggtcgacgc
Reverse primer to order (reverse complement of the design): GCGTCGACCTGCAGTTAACCGCAGTACGG
Now we can order the primers:
http://www.idtdna.com/Home/Home.aspx

Vectors and Bacterial Strains
An important thing to think about before you start cloning: which vectors and E. coli strains should I use?
Vector                  Promoter
pQE-30                  T5 promoter
pMAL                    Ptac promoter
pCANTAB-5E              Plac promoter
pET-Duet, pRSF-Duet     T7 lac promoter (an E. coli strain with phage T7 RNA polymerase is necessary)

E. coli strains we use
•XL1-Blue: mostly good for DNA isolation/phage display
•M15(pREP4): tighter regulation by the lac repressor
•BL21: protease deficient, stable to toxic proteins, and contains the T7 RNA polymerase gene

lac Expression Regulation
[Figure: without inducer, the lac repressor sits on the lac site between the promoter and the RBS and blocks RNA polymerase. With IPTG (or lactose, etc.), the repressor leaves the lac site and the region promoter - lac site - RBS - ATG - your gene is transcribed into mRNA.]

Purification Tags and Selection (Antibiotic Resistance)
•Antibiotic resistance (working concentration)
  •Ampicillin (100 µg/mL)
  •Kanamycin (35 µg/mL)
  •Tetracycline HCl (10 µg/mL)
  •Chloramphenicol (170 µg/mL in ethanol)
•Purification tag
  •His-tag (nickel agarose resin)
  •Maltose Binding Protein (amylose resin)
  •Glutathione S-Transferase (glutathione resin)

Digestion of Insert and Vector
•Digest with the same restriction endonucleases
•Optional (recommended) step:
  •Treat the plasmid DNA with Antarctic phosphatase
  •Decreases the background by stopping self-ligation of singly cut plasmid and background re-ligation

Ligation of the Insert into the Vector
[Diagram: vector + insert]
•Ligation covalently attaches the vector and the insert via a phosphodiester bond (between the 5’ phosphate of one base and the 3’ hydroxyl of the next)

Antarctic Phosphatase and Ligation
[Figure: a 5’ phosphate and the neighboring 3’ hydroxyl being joined into a phosphodiester bond]
•Antarctic Phosphatase removes the 5’ phosphate from the vector, preventing self-ligation
•The insert still has its 5’ phosphate, though
http://www.neb.com/nebecomm/products/productM0202.asp

Transformation
•The functional construct is now ready to be transformed into new E. coli and grown up
•The new DNA isolated from the E. coli must then be sequenced to make sure that everything worked
•Once the sequence is confirmed, we are ready to go!
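The sequence confirmation step can be sketched in Python as well. This is an illustration under stated assumptions, not the lab's procedure: `reverse_complement` and `locate` are hypothetical helper names, and the "read" is a stand-in built from the designed construct on the slides, since no real sequencing data appears in this deck. The same helper also reproduces the reverse primer that the slides obtain from pDRAW32.

```python
# Map each base to its complement, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(read: str, expected: str):
    """Return (strand, index) where `expected` occurs in `read`, or None if absent."""
    read_u, expected_u = read.upper(), expected.upper()
    candidates = (
        ("forward", expected_u),
        ("reverse", reverse_complement(expected_u)),
    )
    for strand, query in candidates:
        index = read_u.find(query)
        if index != -1:
            return strand, index
    return None


# Expected BamHI-to-PstI region of the construct, copied from the slides.
EXPECTED = "GGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAG"

# Stand-in for a sequencing read (assumption): the designed construct itself.
# A real read would come back from the sequencing run.
read = (
    "ATGGGCAGCAGCCATCACCATCATCACCAC"
    "AGCCAGGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAA"
)

# The insert should be found whichever strand was sequenced.
print(locate(read, EXPECTED))
print(locate(reverse_complement(read), EXPECTED))

# Reverse primer from the "Ordering Primers" slide, recomputed from its design.
print(reverse_complement("ccgtactgcggttaactgcaggtcgacgc").upper())
```

An exact substring match is deliberately strict: any point mutation in the cloned region makes `locate` return None, which is the outcome that should trigger a closer look at the read.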