Chapter 9 DNA based Information technologies

9.1 Studying Genes and their Products
Typical research problem. Have isolated a protein that you want to study
Want to isolate enough so can crystallize for X-ray
Want to alter AA sequence in active site to see what happens
Want to see how enzyme interacts with other proteins in cell
Want to see how protein is regulated in cell
Can do all of the above if can get the DNA for the gene encoding the enzyme
but gene is a few thousand base pairs in a chromosome of hundreds of millions of
base pairs.
This section describes the tools to do the above
Genes can be isolated by DNA cloning
Clone - identical copy of something
Originally applied to a cell that was allowed to reproduce to make a
colony of identical cells
In DNA it refers to making many identical copies of a gene
sequence
The Process of Gene cloning involves 5 steps Figure 9-1
1. Cutting DNA at precise locations
Done by sequence specific endonucleases
2. Selecting a small carrier DNA capable of self-replication
These are called cloning vectors
3. Joining the two DNA fragments covalently
Composite product called recombinant DNA
4. Moving the recombinant DNA from a test tube into a host cell
for replication
5. Finding and selecting for the host cells that contain the cloning
vector.
The above process called
Recombinant DNA technology
Genetic engineering
Most of the discussion will focus on E. coli methods, since that system is the best
understood
Restriction Endonucleases and DNA Ligases yield recombinant DNA
Step 1 cutting DNA at specific location
Restriction endonucleases or restriction enzymes
Recognize and cleave DNA at specific sequences
When used on a large piece of DNA makes smaller, well
defined pieces
DNA ligases then link cleaved DNA into cloning vector
Restriction Endonucleases
Found in a wide range of bacteria
Used in bacteria to cleave foreign DNA
Does not cleave its own DNA because it has been methylated at
that site
By specific methylase
Restriction endonuclease and matching methylase called
Restriction-modification system
Three types of restriction endonucleases
Type I
Large multi-subunit protein
Cleaves at random sites that can be more than 1000 bp
from recognition site
Uses ATP energy to move along DNA
Type II
Simpler structure
Does not move on DNA so no need for ATP
Cleaves within recognition sequence
This is the one we use for genetic recombination
Type III
Large multi-subunit protein
Cleaves DNA about 25 bp from recognition site
Uses ATP energy to move along DNA
Thousands of type II discovered
>100 different DNA recognition sites
Recognition site usually 4-6 residues and palindromic
Table 9-2 a small sample
Some make staggered cuts
Leaves 2-4 unpaired bases at each end
Called sticky ends
Can H-bond back to self or to another piece of DNA
Some make blunt end cuts
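The recognition and cutting behavior above can be sketched in a few lines of Python. This is only an illustration: the helper names and the test sequence are made up, and the only enzyme fact assumed is the standard EcoRI site GAATTC, cut after the G on each strand, which leaves AATT sticky ends.

```python
# Hypothetical helpers for illustration; not taken from the text or figures.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_palindromic(site):
    """A site is palindromic if it reads the same on both strands (5' to 3')."""
    return site == reverse_complement(site)

def digest(seq, site, cut_offset):
    """Cut the top strand of a linear sequence at every occurrence of site.

    cut_offset is the number of bases of the site that stay on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

dna = "AAGAATTCTTTTGAATTCGG"  # made-up test sequence with two EcoRI sites
print(is_palindromic("GAATTC"))
print(digest(dna, "GAATTC", 1))
```

Only the top strand is tracked here. Because the site is palindromic, the bottom strand is cut at the mirror-image position, which is what leaves the unpaired bases at each end.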
Average size of DNA fragments produced depends on how often
sequence occurs in DNA
This, in turn, depends on size of site
6 bp site
Random chance of 1 in 4^6 or 1 in 4096
So average fragment size should be about
4100 bp
4 bp site
Random chance of 1 in 4^4 or 1 in 256
So average fragment size about 250 bp
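The arithmetic above is easy to check with a short sketch. It assumes a random sequence with all four bases equally likely, which the notes point out is not quite true of real genomes. The function names are made up, and the 4.6 million bp genome size is an approximate figure for E. coli used only as an example.

```python
def expected_fragment_size(site_length, alphabet_size=4):
    """Average spacing (bp) between sites in a random sequence."""
    return alphabet_size ** site_length

def expected_site_count(genome_length, site_length):
    """Rough number of sites expected in a genome of the given length."""
    return genome_length / expected_fragment_size(site_length)

print(expected_fragment_size(6))   # 6 bp recognition site
print(expected_fragment_size(4))   # 4 bp recognition site
# Approximate E. coli genome size, used only as an illustration
print(round(expected_site_count(4_600_000, 6)))
```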
Not entirely random
Sites occur less often than expected
Can get larger fragments if simply stop reaction before
complete
This is called a partial digest
Can also get larger fragments if use homing endonuclease
Chapter 26!
Recognition sequence 14-20 bp
Once DNA cut can be purified
Agarose or polyacrylamide gel or HPLC
If have cleaved entire chromosome usually too many fragments
So construct a DNA library first (Next major section)
Once purified use ligase to attach to cloning vector that has been
cleaved with same restriction enzyme so sticky ends match
(Cloning vector described in next minor section)
Ligase uses ATP to link DNA strands together
Most efficient with Sticky end restriction endonucleases
But can work with blunt ends as well
Can also link with synthetic DNA containing novel sequences
Called linkers
Or even poly linkers
Linkers contain other sequences that will be useful in the cloning
process
Cloning Vectors
Now let's look at the DNA we are going to attach our chosen sequence to
Plasmids
Circular piece of DNA that replicates separately from bacterial host
chromosome
5,000-400,000 bp
Thought to be molecular parasites
Like a virus DNA that can no longer make a virus capsule
and infect other bacteria
Contain some sequence that allow to reproduce in host bacteria
using the bacteria’s own enzymes to reproduce
Often plasmids are more like symbiotes than parasites
Either confer antibiotic resistance
Or confer a new property on bacteria
Ti plasmid in Agrobacterium tumefaciens - allows
bacteria to invade plant cells
Classic Plasmid pBR322 - constructed in 1977
Figure 9-3
In 4,361 bp sequence
Ori - origin of replication
Uses host bacteria’s enzymes to start replicating at
this point
Associated regulatory system limits to 10-20
copies/cell
Genes that confer resistance to
Ampicillin
Tetracycline
Several sites for unique cleavage with restriction enzymes
PstI, EcoRI, BamHI, SalI, PvuII
Small size of plasmid makes it easier to get into host cell
and to manipulate
Other plasmids
Different ori sequences give different copy numbers
1 to 1000/cell
If have two plasmids in cell and both use same ori
Will interfere with each other
Said to be incompatible
So if need 2 plasmids at once need a different ori on
each plasmid
Transformation
The process of putting a plasmid into a cell
In many bacteria simple
Put bacteria and plasmid in a CaCl2 solution in a test tube at
0 °C
Rapidly bring to 37 °C or 43 °C
Don’t know why but it works!
Some bacteria just naturally ‘competent’ at DNA uptake
Do not need above treatment
Other cells need Electroporation
Subject cell to high voltage pulse
Allows cell membrane to uptake large DNA
Selection
Not all cells will have taken in the plasmid DNA
Now need to identify the cells containing the plasmid
Utilize genes in plasmid called Selectable Markers
Selectable markers - 2 kinds
Allow cell containing plasmid to grow under defined
conditions
Called positive selection
Kill cell containing the plasmid under defined conditions
Negative selection
pBR322
Figure 9-4 shows how to use both
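The selection logic for a plasmid with two resistance genes can be sketched as below. This is an illustration under a stated assumption, not a transcription of the figure: it assumes the foreign DNA is ligated into a restriction site inside one of the two resistance genes, which inactivates that gene. The function names and the choice of which gene is disrupted are hypothetical.

```python
ANTIBIOTICS = {"ampicillin", "tetracycline"}

def resistances(has_plasmid, disrupted_gene=None):
    """Antibiotics a cell can survive.

    Assumption: inserting foreign DNA inside a resistance gene knocks out
    that resistance (insertional inactivation).
    """
    if not has_plasmid:
        return set()
    if disrupted_gene is None:
        return set(ANTIBIOTICS)
    return ANTIBIOTICS - {disrupted_gene}

def classify_colony(grows_on_amp, grows_on_tet, disrupted_gene="ampicillin"):
    """Interpret plate results for one colony."""
    intact = "tetracycline" if disrupted_gene == "ampicillin" else "ampicillin"
    grows = {"ampicillin": grows_on_amp, "tetracycline": grows_on_tet}
    if not grows[intact]:
        return "no plasmid"              # fails the positive selection
    if grows[disrupted_gene]:
        return "plasmid without insert"  # both resistance genes still work
    return "plasmid with insert"         # disrupted gene no longer protects

print(classify_colony(grows_on_amp=False, grows_on_tet=True))
```

Growth on the intact marker is the positive selection for any plasmid. Loss of the second resistance is what distinguishes recombinant plasmids from plasmids that simply closed back up without an insert.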
Screenable Markers
Make transformed cell have color or fluoresce
Plasmid has extra gene for this trait
So can visually see the transformed colonies
Transformation less efficient with larger DNA
About 15,000 bp is the largest insert can do with a plasmid
Bacterial Artificial Chromosomes (BAC’s)
Figure 9-5
Can do for DNA 100,000 bp to 300,000 bp
Approaching size of host chromosome!
Simple plasmid at start
Low copy number in cell (1 or 2)
So do not see recombination events (Next semester)
Also contains selectable and screenable markers
Yeast Artificial Chromosomes (YAC’s)
Figure 9-6
Yeast genetics almost as well understood as E. coli
Easy to maintain
Easy to grow on an industrial scale
Eukaryotic so will be processed differently
Plasmid vectors developed much like E. coli
Some systems with multiple origins for different organisms
Called shuttle vectors
Contains origin for Yeast
Two selectable markers
Telomeres and centromere
Two telomeres (usually at end of eukaryotic chromosome)
BamHI sites used to remove DNA between telomeres to make
linear
Can put in DNA chunks up to 2 x 10^6 bp
Stability of YAC increases with size up to limit
100,000 bp or less - slowly lost
150,000 bp stable
Used to study eukaryotic chromosome metabolism
Expressing protein product of genes
Usually it is the product of the gene, not the gene itself, that you want
either for study or to make commercially
Trying to express eukaryotic gene in bacteria has some issues
Sequences needed for transcription and regulation in original host
do not function in bacterial cell
Promoters
Ribosome binding sites etc
So need to add sequences for bacterial control and transcription
Cloning vectors that contain all the signals for regulated expression are
called
Expression vectors
Again details on these sequences in second semester
Figure 9-7
Many different systems are used to express recombinant proteins
Lots of different systems. Each has advantages and disadvantages
Bacteria
Best understood and most common
Easy to store and grow in lab
Also media is cheap
Can be grown on industrial scale
But proteins may not be processed correctly
Do not fold right
Do not get proper covalent modifications
May need proteolytic cleavage for activation
Many eukaryotic proteins aggregate into insoluble cellular
precipitates
Called inclusion bodies
New systems are always being developed to get around these problems
Skip next two paragraphs
Yeast
Well understood and characterized
Also easy and cheap to grow on industrial scale
Have tough cell walls so harder to get DNA inside
That is why first make the shuttle vector in bacteria
Since it is a eukaryotic system, control mechanisms for eukaryotic genes
work better
But can still have folding and processing problems
Insect and Insect viruses
Baculoviruses Figure 9-9
Insect viruses with double-stranded DNA
genomes
Usually act as parasite
Kills host larvae while making more viruses
Late in infection produce large amounts of two proteins for virus
p10 and polyhedrin
Not needed in cultured insect cells
So replace with your gene of interest
Can get up to 25% of protein to be your protein of interest
Baculovirus most common protein expression system
Genome is 134,000 bp
Too large for direct cloning
Also purifying virus is difficult
Use bacmids instead
Large circular DNA
Contains baculovirus DNA
Plus sequences for replication in E. coli
You usually start your cloning in a smaller plasmid system
Then join with bacmid to make larger gene
Large number of systems commercially available
Protein modification is better
But still some failures
Mammalian cell Culture
Easiest to introduce foreign DNA with viruses
A variety of engineered viruses are available commercially
Using some viruses can get DNA permanently incorporated into
cell line
Very few problems with processing of final protein
Biggest issue is mammalian cell cultures are expensive to maintain
Alteration of Cloned Genes to produce altered proteins
Figure 9-10
Once have a cloned gene being expressed, can then use site-directed
mutagenesis to change sequence of protein
Powerful approach to see how individual residues in a protein affect
structure/function
If they exist use restriction sites to remove a piece of DNA
And replace with a new piece with a few small changes
Also can use oligonucleotide-directed mutagenesis
Make DNA with desired base change, one for each strand
If DNA 30-40 residues will anneal to original DNA
Use as PCR (later this chapter) primers to replicate DNA
Process makes many copies of mutated DNA
Fewer copies of original DNA
If original DNA came from wild-type E. coli will be methylated
Can use DpnI to destroy this DNA!
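The effect of a designed base change on the protein can be previewed in silico before ordering oligonucleotides. The sketch below is a minimal illustration: the function names and the nine-base example gene are made up, and the codon table is the standard genetic code built from the usual TCAG ordering.

```python
# Standard genetic code, codons ordered T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a coding sequence (5' to 3'), ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def mutate_codon(dna, codon_index, new_codon):
    """Return a copy of dna with one codon (0-based index) replaced."""
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly 3 bases")
    start = 3 * codon_index
    if start + 3 > len(dna):
        raise IndexError("codon_index is past the end of the sequence")
    return dna[:start] + new_codon + dna[start + 3:]

wild_type = "ATGTCTGGT"                      # made-up gene fragment: Met-Ser-Gly
mutant = mutate_codon(wild_type, 1, "GCT")   # change Ser codon to an Ala codon
print(translate(wild_type), translate(mutant))
```

The mutated sequence is what the mutagenic primers would carry. The flanking unchanged bases are what let those primers anneal to the original DNA.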
Can also fuse protein or domains together
or remove and replace domains
Product of a fused gene is called a fusion protein
Terminal Tags as handles
You learned back in chapter 3 that the best way to purify a protein was
with affinity chromatography
Many proteins do not have ligands that can be used for affinity chromatography
So as long as we are modifying our protein and producing on an industrial
scale, let's add a tag on either end of the protein that can be purified with
affinity chromatography so purifying the protein is simplified.
Common tags shown in table 9-3
Diagram of process shown in figure 9-11
GST - Glutathione-S-transferase
Small protein (26,000)
Tightly binds to glutathione
Another system
Add 6 or more His residues to one end
His binds to Ni2+
Use chromatographic matrix that has bound Ni2+
Gene Sequence amplification PCR
Polymerase chain reaction PCR - conceived by Kary Mullis in 1983
used to amplify the number of copies of a piece of DNA
relies on DNA Polymerase
Need deoxy NTP’s
Needs template
Only works in 5'→3' direction
Figure 9-12
Start with 2 synthetic pieces of DNA
Complementary to ends of target DNA sequence 1 for each strand
Step 1 Heat DNA to separate strands
Step 2 add primers and cool to anneal
Step 3 add heat stable Taq DNA polymerase + deoxy NTP’s
Will extend from primers
Repeat steps 1-3 25 or 30 times
DNA in question amplified 10^6 times
Each cycle doubles amount of DNA
20 times = 2^20 > 10^6
30 times = 2^30 > 10^9
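The doubling arithmetic can be checked directly. This sketch assumes ideal PCR in which every cycle exactly doubles the target, which real reactions only approximate. The function name is made up.

```python
def copies_after(cycles, starting_copies=1):
    """Copies of target DNA after a number of ideal doubling cycles."""
    return starting_copies * 2 ** cycles

for n in (20, 30):
    print(n, "cycles:", copies_after(n), "copies from a single template")
```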
Can include additional tricks
Have restriction endo nuclease sites added to ends of primers
If short this segment won’t anneal to native DNA
But as amplifies gets incorporated into the new DNA
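A rough sketch of primers carrying restriction sites on their 5' ends is shown below. The function names, the template, and the short 6-base annealing regions are invented for illustration (real primers are longer). The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are the standard ones.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def design_primers(template, start, end, primer_len, fwd_tail="", rev_tail=""):
    """Build a primer pair flanking template[start:end].

    The tails do not anneal to the native DNA; only the last primer_len
    bases of each primer match the target.
    """
    target = template[start:end]
    forward = fwd_tail + target[:primer_len]
    reverse = rev_tail + reverse_complement(target[-primer_len:])
    return forward, reverse

def amplicon(template, start, end, fwd_tail="", rev_tail=""):
    """Top strand of the product once the tails have been copied in."""
    return fwd_tail + template[start:end] + reverse_complement(rev_tail)

template = "CCCATGGCTAGCAAAGGTTAAGGG"  # made-up template
fwd, rev = design_primers(template, 3, 21, 6, fwd_tail="GAATTC", rev_tail="GGATCC")
print(fwd, rev)
print(amplicon(template, 3, 21, "GAATTC", "GGATCC"))
```

After the first couple of cycles the newly made strands include the tails, so later cycles anneal along the full primer and the product ends up flanked by the two restriction sites.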
Can amplify a single copy of DNA into useful amounts
Used to amplify 40,000 year old DNA from mummy
Also Woolly mammoth DNA
Use in forensic analysis See Box 9-1
Since amplifies any DNA - contamination a serious problem
Variations
Reverse transcriptase PCR (RT-PCR)
Start with an RNA
Use a reverse transcriptase for first cycle to make a DNA
Use regular PCR to replicate the DNA
Quantitative PCR (qPCR)
Figure 9-13
Have a probe which has fluorophore and quencher
With both on 1 molecule - no fluorescence
Amplify DNA
When probe binds to target
Fluorophore and quencher separated
Will begin to fluoresce
If target DNA in high amounts will see fluorescence after
fewer cycles of PCR. If target DNA in low amounts will take
more cycles
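The link between starting amount and the cycle at which fluorescence appears can be sketched as follows. This is a simplification: it assumes perfect doubling each cycle and treats the detection threshold as a fixed number of copies. The function name and the numbers are made up.

```python
def threshold_cycle(starting_copies, threshold_copies):
    """Number of ideal doubling cycles needed to reach the detection threshold."""
    if starting_copies <= 0:
        raise ValueError("starting_copies must be positive")
    copies = starting_copies
    cycles = 0
    while copies < threshold_copies:
        copies *= 2
        cycles += 1
    return cycles

abundant = threshold_cycle(1000, 10 ** 9)  # target present in high amounts
rare = threshold_cycle(10, 10 ** 9)        # target present in low amounts
print(abundant, rare)
```

Each tenfold drop in starting template costs a little over three extra cycles, since 2^3.3 is about 10.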
9.2 Using DNA based methods to understand Protein Function
Can describe protein function on 3 levels
1. Phenotypic function
Effect of protein on entire organism
Modify protein and see overall change in organism
2. Cellular function
Describe the network of interactions within the cell
3. Molecular function
Precise biochemical activity of protein
Wide variety of DNA based techniques to study at all levels
DNA Libraries
A collection of DNA clones
Used for genome sequencing, gene discovery or determination of
gene/protein function
Genomic Library
Cleave entire genome of an organism into 1000's of fragments
All fragments cloned by insertion into vectors
Step 1. Partial digestion of genome by restriction endonucleases
Make fragments of a limited size range
Trying to make sure all genes are in library clones
Remove large and small fragments with centrifugation or
electrophoresis
Step 2 digest a BAC or YAC with same endonuclease
Step 3 ligate genome and vector DNA together and transform into
yeast or bacteria so have library of cells containing genomic DNA
Complementary DNA library
Step 1 extract mRNA from an organism or a tissue
Step 2 reverse transcriptase PCR
Step 3 Clone DNA into vectors to make cDNA library
Clones only those parts of the genome that are expressed
Get rid of all introns
Sequence or Structural relationships for Protein function
‘Comparative genomics’
Compare newly discovered gene to genes of known function
Genes with similar sequence or function from different species
Orthologs
Genes with similar sequence or function in same species
Paralogs
Easiest to compare in similar species (human to mouse)
But orthologs observed between bacteria and humans
conserved gene order on a chromosome observed in closely related
species
Synteny
Sometimes see structural motifs that point to function
ATP binding domains in ATPases
Zinc fingers in DNA binding proteins
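Comparing a newly discovered gene to genes of known function starts with a similarity score. The sketch below computes percent identity for two sequences that are already aligned and of equal length. Real comparisons use alignment tools that handle gaps, so treat this only as an illustration of the idea. The function name and sequences are made up.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)

new_gene = "ATGGCTAGCAAAGGT"      # made-up sequences
known_gene = "ATGGCTAGTAAAGGT"
print(round(percent_identity(new_gene, known_gene), 1))
```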
Fusion Proteins and Immunofluorescence to localize protein in cell
Location in cell can give clue to function
Green Fluorescent protein (GFP)
Derived from jellyfish Aequorea victoria
Fluorophore in center of a beta barrel
Just needs O2 to fluoresce
Fuse onto protein of interest
Use microscope to see location in cell
Several proteins with different color now available
Figure 9-16
Alternate method
Fuse protein of interest with short protein sequence that has well
characterized antibodies
Called epitope tag
Kill and fix cell on microscope slide
Attach fluorophore to antibody
Allow antibody to find target
Target location now fluoresces
Figure 9-17
Similar method, don’t alter protein
But raise antibody to it
And have fluorescent antibody that binds to first antibody
Can do these things for entire library!
Protein-Protein Interactions
make protein with epitope tag
In cell precipitate protein with antibody
See if any other proteins precipitate out because in complex with first
protein
Many variations
Tandem affinity purification Figure 9-20
Yeast two-hybrid analysis Figure 9-21 Skip
DNA Microarrays
Short DNA segments from genes of known sequence (50 to 100s of bp),
all amplified by PCR
Robotic device adds nanoliter drops of solution in a predesigned array onto
solid surface that binds DNA
Or simply synthesize DNA onto solid surface Figure 9-22
Resulting array called a chip
May include a sequence from every gene in an organism
Probe chip with mRNA or cDNA from an organism or tissue in a particular
state
Provides researcher with snapshot of all genes being expressed at
that time
Figure 9-23
Figure 9-24
9.3 Genomics and the Human Story
2 complete human genomes published in 2001
Watson-Collins - publicly funded
Venter- Privately funded
Reflects about 10 years of sequencing by methods given so far
New generation of DNA sequencing
New technology - “Next gen” sequencers
bacterial genome a few hours
Human genome a day or two
Step 1 Shear genome to randomly generate fragments of a few hundred
base pairs
Step 2 add synthetic sequences of DNA to end of all DNA
This gives you a reference point
Step 3 Immobilize DNA on a matrix
Step 4 PCR amplify all DNA sequences
Now have microarray of millions of DNA fragments
2 different ways to sequence all these fragments
Pyrosequencing Figure 9-25
Flush chip with each of the dNTP’s in turn
And then use apyrase to degrade unreacted base
If could not add to DNA because not next in sequence,
nothing happens
If could add to DNA because next in sequence would
release PPi
Sulfurylase converts PPi to ATP
ATP reacts with luciferin (via luciferase) to create flash of light
Watch chip to see where bases are added
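The flow-by-flow logic of pyrosequencing can be simulated for a single spot on the chip. This is a toy sketch under stated assumptions: the A, C, G, T flow order is arbitrary, the light signal is taken to be proportional to the number of bases added in a flow (so a run of identical bases gives one brighter flash), and the function names and the sequence are made up.

```python
def pyrosequence(new_strand, flow_order="ACGT"):
    """Return a list of (dNTP flushed, bases incorporated) for each flow.

    new_strand is the sequence the polymerase will synthesize at this spot.
    A flow with 0 bases incorporated gives no PPi and therefore no light.
    """
    if any(base not in flow_order for base in new_strand):
        raise ValueError("sequence contains a base that is never flushed")
    signals = []
    pos = 0
    while pos < len(new_strand):
        for base in flow_order:
            added = 0
            while pos < len(new_strand) and new_strand[pos] == base:
                added += 1
                pos += 1
            signals.append((base, added))
            if pos == len(new_strand):
                break
    return signals

def read_from_signals(signals):
    """Reconstruct the sequence from the recorded flashes."""
    return "".join(base * count for base, count in signals)

flashes = pyrosequence("GATTC")
print(flashes)
print(read_from_signals(flashes))
```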
Reversible terminator sequencing Figure 9-26
1. Add blocked fluorescently labeled nucleotides
Use fluorescence to see what nucleotides added
where
2. Remove labels and blocking groups
3. Wash
Repeat step 1 etc
And I will quit there