Cloning 101: A Primer - University of Arizona

Ghosh Lab
University of Arizona
Department of Chemistry
Outline
•Cloning overview
•pDRAW32
• Design
•Gene
•Insert
•Primers
•Further considerations
(optimization of the process)
•Transformation
Cloning Overview
Four main steps in cloning:
•Insert synthesis
•Restriction enzyme digest
•Ligation
•Transformation
Plasmid (vector) + Insert (your gene) → Functional construct
Design Overview
Steps to follow in designing your cloning experiment:
•Design your gene
•Design your insert
•Pick your enzymes
•Check your design
•Recheck your design
pDRAW32
Plasmid maps: pDRAW32
All of the important information in one place!
ClaI - 6067 - AT'CG_AT - dam methylated!
SgrAI - 6023 - Cr'CCGG_yG
AfeI - 5941 - AGC'GCT
SphI - 5875 - G_CATG'C
EcoNI - 5810 - CCTnn'n_nnAGG
PfoI - 5774 - T'CCnGG_A
PflMI - 5767 - CCAn_nnn'nTGG
BstAPI - 5666 - GCAn_nnn'nTGC
MluI - 5342 - A'CGCG_T
BclI - 5328 - T'GATC_A - dam methylated!
BstEII - 5160 - G'GTnAC_C
ApaI - 5139 - G_GGCC'C
PspOMI - 5135 - G'GGCC_C
BssHII - 4931 - G'CGCG_C
HpaI - 4840 - GTT'AAC
HincII - 4840 - GTy'rAC
BsmBI - 4727 - CGTCTCn'nnnn_
BsaXI - 4704 - GGAGnnnnnGTnnnnnnnnn_nnn'
BsaXI - 4674 - ACnnnnnCTCCnnnnnnn_nnn'
XbaI - 30 - T'CTAG_A
NcoI - 69 - C'CATG_G
BamHI - 106 - G'GATC_C
EcoRI - 112 - G'AATT_C
NcoI - 281 - C'CATG_G
MscI - 286 - TGG'CCA
NdeI - 345 - CA'TA_TG
BsrGI - 391 - T'GTAC_A
MluI - 437 - A'CGCG_T
BaeI - 456 - ACnnnnGTAyCnnnnnnn_nnnnn'
BaeI - 489 - GrTACnnnnGTnnnnnnnnnn_nnnnn'
PspXI - 536 - vC'TCGA_Gb
AvaI - 536 - C'yCGr_G
XhoI - 536 - C'TCGA_G
BstZ17I - 565 - GTA'TAC
BamHI - 635 - G'GATC_C
MfeI - 675 - C'AATT_G
SalI - 718 - G'TCGA_C
HincII - 720 - GTy'rAC
BstBI - 737 - TT'CG_AA
EcoICRI - 820 - GAG'CTC
SacI - 822 - G_AGCT'C
NotI - 834 - GC'GGCC_GC
AflII - 847 - C'TTAA_G
BsrGI - 874 - T'GTAC_A
NdeI - 982 - CA'TA_TG
BglII - 989 - A'GATC_T
MfeI - 995 - C'AATT_G
EcoRV - 1003 - GAT'ATC
FseI - 1012 - GG_CCGG'CC
BaeI - 1016 - GrTACnnnnGTnnnnnnnnnn_nnnnn'
AsiSI - 1021 - GCG_AT'CGC
PvuI - 1021 - CG_AT'CG
ZraI - 1028 - GAC'GTC
AatII - 1030 - G_ACGT'C
Acc65I - 1032 - G'GTAC_C
KpnI - 1036 - G_GTAC'C
PspXI - 1038 - vC'TCGA_Gb
AvaI - 1038 - C'yCGr_G
XhoI - 1038 - C'TCGA_G
BaeI - 1049 - ACnnnnGTAyCnnnnnnn_nnnnn'
PflMI - 1085 - CCAn_nnn'nTGG
PacI - 1113 - TTA_AT'TAA
AvrII - 1117 - C'CTAG_G
BlpI - 1135 - GC'TnA_GC
BsaAI - 1460 - yAC'GTr
DraIII - 1463 - CAC_nnn'GTG
AloI - 1499 - GAACnnnnnnTCCnnnnnnn_nnnnn'
BsaXI - 1499 - ACnnnnnCTCCnnnnnnn_nnn'
BsaXI - 1529 - GGAGnnnnnGTnnnnnnnnn_nnn'
AloI - 1531 - GGAnnnnnnGTTCnnnnnnn_nnnnn'
PsiI - 1588 - TTA'TAA
SspI - 1668 - AAT'ATT
AhdI - 1873 - GACnn_n'nnGTC
BsaI - 1934 - GGTCTCn'nnnn_
BglI - 1993 - GCCn_nnn'nGGC
FspI - 2095 - TGC'GCA
PvuI - 2243 - CG_AT'CG
ScaI - 2353 - AGT'ACT
XmnI - 2472 - GAAnn'nnTTC
BssSI - 2537 - C'ACGA_G
Plasmid shown: pETDuet GFPuv, 6104 bp
AlfI - 4513 - GCAnnnnnnTGCnnnnnnnnnn_nn'
PpuMI - 4473 - rG'GwC_Cy
Bpu10I - 4373 - CC'TnA_GC
AlfI - 4292 - GCAnnnnnnTGCnnnnnnnnnn_nn'
AfeI - 4228 - AGC'GCT
XmnI - 3924 - GAAnn'nnTTC
BsmBI - 3837 - CGTCTCn'nnnn_
PfoI - 3835 - T'CCnGG_A
Tth111I - 3736 - GACn'n_nGTC
BsaAI - 3730 - yAC'GTr
BstZ17I - 3711 - GTA'TAC
SapI - 3595 - GCTCTTCn'nnn_
PciI - 3478 - A'CATG_T
BssSI - 3305 - C'ACGA_G
AlwNI - 3069 - CAG_nnn'CTG
pDRAW32
You can look at the
sequence in detail
•Open reading frames
•Translation
•Restriction sites
•Complementary strand
Design of the Gene
If you are cloning out of a known plasmid, just use
the sequence that you have
Example, the gene we want:
G C D R A S P Y C G
We got this from phage display:
ggctgcgacagggcgagcccgtactgcggt
G C D R A S P Y C G
Phage sequence
Final sequence for the gene of interest:
ggctgcgacagggcgagcccgtactgcggttaa
G C D R A S P Y C G *
Add a stop codon
Design of the Gene
•If you are designing the gene from scratch, keep codon usage in mind
•Not all codons are created equal
•Un-optimized codons can lead to lower expression levels
•Codon usage reflects the levels of tRNA available in E. coli
•Pay attention to the stop codons too (XL1-Blue reads through TAG, the amber stop codon, 20% of the time)
What if we don’t have the DNA sequence?
Design from scratch! (don’t forget about codon usage)
E. coli Codon Usage
(codon, amino acid, fraction of use among that amino acid's codons; * = stop)

UUU F 0.59   UCU S 0.17   UAU Y 0.60   UGU C 0.47
UUC F 0.41   UCC S 0.15   UAC Y 0.40   UGC C 0.53
UUA L 0.15   UCA S 0.15   UAA * 0.60   UGA * 0.31
UUG L 0.13   UCG S 0.13   UAG * 0.09   UGG W 1.00

CUU L 0.12   CCU P 0.19   CAU H 0.58   CGU R 0.35
CUC L 0.10   CCC P 0.13   CAC H 0.42   CGC R 0.34
CUA L 0.04   CCA P 0.21   CAA Q 0.34   CGA R 0.07
CUG L 0.46   CCG P 0.47   CAG Q 0.66   CGG R 0.12

AUU I 0.49   ACU T 0.19   AAU N 0.51   AGU S 0.16
AUC I 0.38   ACC T 0.38   AAC N 0.49   AGC S 0.23
AUA I 0.13   ACA T 0.19   AAA K 0.73   AGA R 0.08
AUG M 1.00   ACG T 0.24   AAG K 0.27   AGG R 0.05

GUU V 0.29   GCU A 0.19   GAU D 0.63   GGU G 0.34
GUC V 0.20   GCC A 0.26   GAC D 0.37   GGC G 0.36
GUA V 0.17   GCA A 0.24   GAA E 0.67   GGA G 0.14
GUG V 0.34   GCG A 0.31   GAG E 0.33   GGG G 0.16
or preferably…
http://www.bioinformatics.org/sms2/rev_trans.html
http://www.entelechon.com/index.php?id=tools/backtranslation&lang=eng
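As a rough illustration of what the back-translation tools do, the sketch below picks the most frequent E. coli codon for each amino acid, read off the codon usage table in these slides. The names `PREFERRED_CODON` and `reverse_translate` are made up for this example, and always taking the top codon is a simplifying assumption; real tools weigh more factors, and the result should still be checked for unwanted restriction sites.

```python
# Most frequent E. coli codon per amino acid, taken from the usage table
# in these slides (U written as T for DNA). "*" is the stop codon.
# This lookup is an illustrative assumption, not a complete optimizer.
PREFERRED_CODON = {
    "F": "TTT", "L": "CTG", "S": "AGC", "Y": "TAT", "*": "TAA",
    "C": "TGC", "W": "TGG", "P": "CCG", "H": "CAT", "Q": "CAG",
    "R": "CGT", "I": "ATT", "M": "ATG", "T": "ACC", "N": "AAT",
    "K": "AAA", "V": "GTG", "A": "GCG", "D": "GAT", "E": "GAA",
    "G": "GGC",
}


def reverse_translate(peptide, add_stop=True):
    """Back-translate a one-letter peptide into DNA using the top codon."""
    dna = "".join(PREFERRED_CODON[residue] for residue in peptide.upper())
    if add_stop:
        dna += PREFERRED_CODON["*"]  # finish with the preferred stop codon
    return dna


if __name__ == "__main__":
    # The example peptide from the gene design slides.
    print(reverse_translate("GCDRASPYCG"))
```

The output differs from the phage-display sequence used in the slides, which is expected: many DNA sequences encode the same peptide.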
Choice of Restriction Sites/Enzymes
Once you have your gene, you need to design
a way to get it into your plasmid
•Endonucleases (or restriction enzymes) are
enzymes which cut DNA at specific internal
recognition sequences
•Compare with exonucleases, which digest DNA from one end
•You must choose restriction sites that are available
in the plasmid you are cloning into
•They must not appear in your gene (silent mutations
can remove unwanted sites in your designed gene)
Really Important Factors to Remember
When Choosing Restriction Enzymes
•Restriction sites must exist only once in your
plasmid
•They must be in the correct position relative to
the purification tag
•Restriction sites usually add extra residues to
your gene product; make sure they are compatible
with your peptide/protein
•Some restriction sites are sub-optimal for cloning
•Blunt end sites
•dam and dcm methylation-affected enzymes
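The "must not appear in your gene" and "only once" rules can be checked with a simple count. The sketch below uses the example gene and the multiple cloning site fragment that appear elsewhere in these slides; `count_site` is a hypothetical helper name. It searches one strand only, which is enough for palindromic sites such as BamHI and PstI, and it checks only the fragment shown, not the whole plasmid.

```python
def count_site(sequence, site):
    """Count occurrences of a recognition site, allowing overlaps.

    Assumption: the site is palindromic, so searching one strand finds
    every occurrence on both strands.
    """
    sequence, site = sequence.upper(), site.upper()
    width = len(site)
    return sum(
        1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == site
    )


SITES = {"BamHI": "GGATCC", "PstI": "CTGCAG"}

# Example gene and multiple cloning site fragment from the slides.
gene = "ggctgcgacagggcgagcccgtactgcggttaa"
mcs = "AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC"

if __name__ == "__main__":
    for name, site in SITES.items():
        print(name,
              "| in gene:", count_site(gene, site),
              "| in MCS fragment:", count_site(mcs, site))
```

A usable pair of enzymes gives 0 hits in the gene and exactly 1 hit in the full plasmid sequence; pDRAW32 reports the same information graphically.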
Blunt vs Sticky Ends
Most common restriction enzymes: digestion leaves “sticky ends”
Before digestion (each strand written 5’ to 3’; spaces mark the cut sites):
AGCCAG GATCCGGGCTGCAAGCGGTTAAG AATTCGTCGAC
GTCGACG AATTCTTAACCGCTTCCAGCCCG GATCCTGGCT
After digestion:
AGCCAG + GATCCGGGCTGCAAGCGGTTAAG + AATTCGTCGAC
GTCGACG + AATTCTTAACCGCTTCCAGCCCG + GATCCTGGCT
•“Sticky ends”: 5’ or 3’ overhangs that allow the DNA fragments to
anneal even though they are not covalently bound
•They help with the next step: ligation
Blunt-end restriction enzymes: digestion leaves no sticky ends
Before digestion (each strand written 5’ to 3’; spaces mark the cut sites):
AGCCAGAT ATCGGGCTGCAAGCGGTTAACAG CTGGTCGAC
GTCGACCAG CTGTTAACCGCTTCCAGCCCGAT ATCTGGCT
After digestion:
AGCCAGAT + ATCGGGCTGCAAGCGGTTAACAG + CTGGTCGAC
GTCGACCAG + CTGTTAACCGCTTCCAGCCCGAT + ATCTGGCT
dam Methylation
[Structure diagram: Dam methylase converts adenine (NH2) in DNA to N6-methyladenine (NH-Me)]
•Dam methylase adds a methyl group to the N6 nitrogen of adenine in the sequence GATC
•All of the E. coli strains that we use produce dam-methylated DNA
•Some enzymes cut only dam-methylated DNA, e.g. DpnI
•Some enzymes do not cut dam-methylated DNA, e.g. XbaI (when its site overlaps a GATC)
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/dam_dcm_methylases_of_ecoli.asp
dcm Methylation
[Structure diagram: Dcm methylase converts cytosine in DNA to 5-methylcytosine]
•Dcm methylase adds a methyl group to the C5 carbon of the internal cytosine in the sequences CCAGG and CCTGG
•The enzyme we use most that can be affected by dcm
methylation is SfiI
•XL1-Blue is Dcm+; BL21 carries a dcm mutation, so check the genotype of the strain you use
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/dam_dcm_methylases_of_ecoli.asp
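To see where methylation could interfere, it helps to list the dam (GATC) and dcm (CCAGG, CCTGG) sites in a sequence and compare them with the restriction sites you plan to use. The sketch below does this with regular expressions; the function name `methylation_sites` is an assumption for illustration, and whether a given enzyme is actually blocked still has to be looked up in the NEB table linked above.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
DAM_SITE = re.compile(r"(?=GATC)")        # Dam methylates the A in GATC
DCM_SITE = re.compile(r"(?=CC[AT]GG)")    # Dcm methylates CCAGG and CCTGG


def methylation_sites(sequence):
    """Return 0-based start positions of dam and dcm sites in a sequence."""
    seq = sequence.upper()
    return {
        "dam": [m.start() for m in DAM_SITE.finditer(seq)],
        "dcm": [m.start() for m in DCM_SITE.finditer(seq)],
    }


if __name__ == "__main__":
    # Multiple cloning site fragment used in these slides.
    mcs = "AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC"
    print(methylation_sites(mcs))
    # An XbaI site (TCTAGA) followed by TC creates an overlapping GATC.
    print(methylation_sites("TCTAGATC"))
```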
Design of the Insert
•Once you have chosen your restriction enzymes, it is time
to design the final, complete gene
•The multiple cloning site of the plasmid you are
cloning into should already have the 5’ portion of the gene
intact (i.e. RBS, spacer, Met)
•Sequences must be in frame
Sites: NcoI, BtgI
 51 CTTTAATAAG GAGATATACC ATGGGCAGCA GCCATCACCA TCATCACCAC
    Translation (from ATG): M G S S H H H H H H
Sites, 5’ to 3’: BamHI, EcoRI, EcoICRI, SacI, BssHII, AscI, SbfI, PstI, SalI, AccI, HindIII, NotI
101 AGCCAGGATC CGAATTCGAG CTCGGCGCGC CTGCAGGTCG ACAAGCTTGC
    Translation: S Q D P N S S S A R L Q V D K L A
Design of the Insert
Multiple cloning site:
 71 ATGGGCAGCAGCCATCACCATCATCACCAC
    M G S S H H H H H H
Sites: BamHI, EcoRI, EcoICRI, SacI, AscI, SbfI, PstI, SalI, AccI, HindIII
101 AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC
    S Q D P N S S S A R L Q V D K L A
The gene we want:
ggctgcgacagggcgagcccgtactgcggttaa
G C D R A S P Y C G *
Be aware of the amber stop codon: TAG
Place the gene between the BamHI and PstI sites:
AGCCAGGATCCGAATTCGAGCTCGGCGCGCCTGCAGGTCGACAAGCTTGC
S Q D P N S S S A R L Q V D K L A
Result:
AGCCAGGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAA
Design of the Insert
Always check and re-check your sequence!
ATGGGCAGCA GCCATCACCA TCATCACCAC
AGCCAGGATCCGggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAA
Translate the whole gene
atgggcagcagccatcaccatcatcaccacagccaggatccgggctgcgacagggcgagc
M G S S H H H H H H S Q D P G C D R A S
ccgtactgcggttaactgcaggtcgacaa
P Y C G - L Q V D
Everything looks good: in frame the whole way!
Design of the Insert
The wrong way to do it:
AGCCAGGATCC ggctgcgacagggcgagcccgtactgcggttaaCTGCAGGTCGACAAGCTT
The gene is just inserted after the restriction site, which is
out of frame with the plasmid-encoded start-codon/His-tag
atgggcagcagccatcaccatcatcaccacagccaggatccggctgcgacagggcgagcc
M G S S H H H H H H S Q D P A A T G R A
cgtactgcggttaactgcaggtcgacaagctt
R T A V N C R S T S
Frame shifted = garbage!
**Some plasmids, for whatever reason, have restriction
sites out of frame with the translated gene**
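The frame check on the last two slides is easy to automate. The sketch below translates both designs with the standard genetic code, built from the usual TCAG ordering; the helper name `translate` is chosen for this example. pDRAW32 does the same job interactively.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Correct design: gene placed in frame after GGATCCG.
in_frame = ("atgggcagcagccatcaccatcatcaccacagccaggatccgggctgcgacagggcgagc"
            "ccgtactgcggttaactgcaggtcgacaa")
# Wrong design: gene placed directly after GGATCC, one base short.
out_of_frame = ("atgggcagcagccatcaccatcatcaccacagccaggatccggctgcgacagggcgagcc"
                "cgtactgcggttaactgcaggtcgacaagctt")

if __name__ == "__main__":
    print(translate(in_frame))       # His-tag, then the peptide, then a stop (*)
    print(translate(out_of_frame))   # His-tag, then unrelated residues, no stop
```

A quick sanity test is that the in-frame translation contains the peptide GCDRASPYCG immediately followed by `*`, while the frame-shifted one does not.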
Finishing Touches
•Restriction enzymes need extra base pairs on the 5’ and 3’ sides of the recognition site to cut properly
•NEB has a reference guide for specific enzymes (see link below)
•A good rule of thumb is 6 base pairs beyond the recognition site
•Adding a GC “clamp” at the beginning and end of the
sequence is also a good idea
atgggcagcagccatcaccatcatcaccacagccaggatccgggctgcgacagggcgagc
M G S S H H H H H H S Q D P G C D R A S
ccgtactgcggttaactgcaggtcgacaa
P Y C G - L Q V D
Final gene, polished and ready to go:
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
http://www.neb.com/nebecomm/tech_reference/restriction_enzymes/cleavage_linearized_vector.asp
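The 6 base pair rule of thumb can be checked on the final gene by counting the bases on either side of each recognition site. This is a minimal sketch; `flanking_lengths` is a hypothetical helper, it looks only at the first occurrence of a site, and the NEB guide remains the reference for how many bases a particular enzyme really needs.

```python
def flanking_lengths(sequence, site):
    """Return (bases before, bases after) the first occurrence of a site."""
    seq, site = sequence.lower(), site.lower()
    start = seq.find(site)
    if start == -1:
        raise ValueError(f"site {site!r} not found in sequence")
    return start, len(seq) - (start + len(site))


# Final gene from the slide above.
final_gene = ("gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc")

if __name__ == "__main__":
    upstream_of_bamhi, _ = flanking_lengths(final_gene, "ggatcc")
    _, downstream_of_psti = flanking_lengths(final_gene, "ctgcag")
    print("bases 5' of BamHI site:", upstream_of_bamhi)
    print("bases 3' of PstI site:", downstream_of_psti)
```

Only the outward-facing flank of each site matters here, since the other side is the insert itself.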
Design of the Primers
Once the insert is designed correctly, the next step is designing
primers to order from IDT, based on insert synthesis strategy
Three main strategies towards insert synthesis:
•PCR amplification
•Klenow extension of overlapping primers
•Complementary full-length primers
Insert + Vector
PCR Amplification of Insert from
an Existing Gene
The most common method of insert synthesis
•Necessitates a pre-existing construct
•Extra restriction sites and/or amino acid
residues can be added on each side of the gene
•Internal mutations are more difficult
Insert
PCR Synthesis of Insert
PCR amplification from overlapping primers
•No pre-existing construct is needed
•PCR products can be messy, which may make subsequent reactions difficult
•Good for inserts >150 bp
[Diagram: four overlapping primers assembled into the insert; relative amounts F1: 10x, F2: 1x, R1: 1x, R2: 10x]
Full-length insert should still be the major product
Klenow Extension of Overlapping
Primers
•Two primers that are complementary in their 3’
regions are designed (overlap 15 bp)
•They are extended to full length by the Klenow fragment
of DNA Polymerase I
•Useful if the insert is 50 to 150 bp
[Diagram: two primers annealed at their 3’ ends are extended by Klenow to give the full-length insert]
Klenow fragment: retains the 5’ to 3’ polymerase activity (and the 3’ to 5’
proofreading exonuclease), but lacks the 5’ to 3’ exonuclease activity
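As a sketch of how the two Klenow primers relate to the designed insert, the code below splits an insert into a forward primer and a reverse primer that share a 15 bp overlap at their 3’ ends. The function names and the choice to centre the overlap are assumptions made for illustration; in practice the overlap position is also chosen with melting temperature and secondary structure in mind.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (case preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


def klenow_primers(insert, overlap=15):
    """Split an insert into two primers whose 3' ends overlap.

    Assumption: the overlap is placed roughly in the middle of the insert.
    Both primers are returned 5' to 3', ready to order.
    """
    if len(insert) <= overlap:
        raise ValueError("insert must be longer than the overlap")
    start = (len(insert) - overlap) // 2
    forward = insert[:start + overlap]            # top strand, 5' part
    reverse = reverse_complement(insert[start:])  # bottom strand, 3' part
    return forward, reverse


if __name__ == "__main__":
    # The designed insert from these slides, used only as an example.
    insert = "gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc"
    fwd, rev = klenow_primers(insert)
    print("forward:", fwd)
    print("reverse:", rev)
```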
Complementary Full-Length Primers
•The simplest approach
•Order two primers that complement each other
•Mix the two primers, heat, and anneal slowly (to
ensure proper base-pairing)
•Feasible if the total insert size is < 60 bp
[Diagram: the two complementary primers anneal to form the double-stranded insert]
Designing Primers to Order
Once the insert synthesis technique is decided,
primer design is fairly straight-forward
Forward primers:
•Assess necessary overlap and copy the sequence from your
designed gene, along with extra 5’ sequence
Reverse primers:
•First, design exactly as if it were a forward primer:
Copy necessary overlap and extra 3’ sequence from your
designed gene
•Once all this is in place, use the pDRAW32 sequence manipulator
to calculate the reverse complement
•Order the pDRAW32-calculated sequence directly
Cloning Out an Existing Gene
In the example mentioned previously, we would normally use
full length overlapping primers, but let’s look at the more
common case of having a preexisting gene:
Preexisting gene:
Overlap
tgcggcccagccggccatgggctgcgacagggcgagcccgtactgcggtggaggcggtgctgcagcgc
A A Q P A M G C D R A S P Y C G G G G A A A
+
Goal gene:
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
Extra sequence from gene design
Forward Primer:
gccagccaggatccgggctgcgacagg
Design of Reverse Primer:
ccgtactgcggttaactgcaggtcgacgc
Ordering Primers
gccagccaggatccgggctgcgacagggcgagcccgtactgcggttaactgcaggtcgacgc
S Q D P G C D R A S P Y C G - L Q V D
Forward primer to order:
gccagccaggatccgggctgcgacagg
&
Design of Reverse Primer:
ccgtactgcggttaactgcaggtcgacgc
Reverse primer to order:
GCGTCGACCTGCAGTTAACCGCAGTACGG
Now we can order the primers:
http://www.idtdna.com/Home/Home.aspx
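The reverse complement step does not depend on pDRAW32; any tool that reverses the sequence and swaps A/T and G/C gives the same answer. A minimal sketch, with `reverse_complement` as an illustrative helper name, reproduces the reverse primer above:

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse the sequence and complement each base (case preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    designed = "ccgtactgcggttaactgcaggtcgacgc"      # reverse primer as designed
    print(reverse_complement(designed).upper())    # sequence to order
```

Running the function twice returns the original sequence, which is a quick way to catch copy-and-paste errors before ordering.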
Vectors and Bacteria Strains
An important thing to think about before you start cloning:
Which vectors and E. coli strains should I use?
Vector: Promoter
•pQE-30: T5 promoter
•pMAL: Ptac promoter
•pCANTAB-5E: Plac promoter
•pET-Duet, pRSF-Duet: T7 lac promoter (an E. coli strain with phage T7 RNA polymerase is necessary)
E. coli strains we use
•XL1-Blue: mostly good for DNA isolation/phage display
•M15(pREP4): tighter regulation by the lac repressor
•BL-21: protease deficient, stable to toxic proteins, and contains the T7 RNA polymerase gene
lac Expression Regulation
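[Diagram: lac expression regulation]
•No inducer: the lac repressor binds the lac site next to the promoter and blocks RNA polymerase (X), so RBS / ATG- your gene is not transcribed
•With IPTG (or lactose, etc): the repressor leaves the lac site and RNA polymerase transcribes the mRNA (RBS / ATG- your gene)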
Purification Tags and Selection
(Antibiotic Resistance)
•Antibiotic resistance (working concentration)
•Ampicillin (100 µg/mL)
•Kanamycin (35 µg/mL)
•Tetracycline HCl (10 µg/mL)
•Chloramphenicol (170 µg/mL in ethanol)
•Purification Tag
•His-tag (nickel agarose resin)
•Maltose Binding Protein (amylose resin)
•Glutathione S-Transferase (glutathione resin)
Digestion of Insert and Vector
•Digest with the same restriction endonucleases
•Optional (recommended) step:
•Treat the plasmid DNA with Antarctic phosphatase
•Decreases the background by stopping self-ligation
of singly cut plasmid and background re-ligation
Ligation of the Insert into the Vector
+
•Ligation covalently attaches the vector and the
insert via a phosphodiester bond (5’phosphate and 3’
hydroxyl of the next base)
Antarctic Phosphatase and Ligation
[Reaction scheme: a 3’ hydroxyl end (HO) and a 5’ phosphate end are joined into a phosphodiester bond between R1 and R2]
•Antarctic Phosphatase removes the 5’ phosphate from the cut vector, which prevents self-ligation
•The insert still has its 5’ phosphate, so it can still be ligated into the vector
http://www.neb.com/nebecomm/products/productM0202.asp
Transformation
•The functional construct is now ready to be transformed
into new E. coli and grown up
•The new DNA isolated from the E. coli must then be
sequenced to make sure that everything worked
•Once the sequence is confirmed, we are ready to go!