Recombinant DNA Technology for the non

Introduction to
DNA Sequencing
Technology
Dideoxy Sequencing (Sanger
Sequencing, Chain Terminator
method).
• Clone the fragments to be sequenced
into the virus M13.
• Why M13?
• The clones that are isolated are single-stranded DNA.
Primer
^^^
. . . . . TGATGTCGAGCGAGTCGTACGGT----Fragment to be deciphered
M13
DNA sequencing reaction:
1) DNA fragment to be sequenced cloned into the
vector M13
2) DNA polymerase
3) “Universal” primer
4) All 4 DNA building blocks
5) One ddNTP tagged with a radioactive tracer
The most popular technique is
based on the dideoxynucleotide.
Purine
Pyrimidine
Set up 4 separate reactions. Each
reaction contains one of the 4
ddNTPs. Each ddNTP is tagged
with a radioactive tracer.
A reaction (with ddA) -> 21, 26, 29, . . . .
T reaction (with ddT) -> 25, 31, 35, . . . . .
C reaction (with ddC) -> 22, 23, 27, . . . .
G reaction (with ddG) -> ??
(3’ end of primer)
Primer (20 nt.)
^^^
. . . . . TGATGTCGAGCGAGTCGTACGGT-----
M13
• Each reaction generates a set of unique
fragment lengths.
• All fragment lengths are represented
(from 21 - > 1,000 nucleotides).
• None of the fragments are present in
more than one reaction.
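The three points above can be checked with a toy model of the four reactions. This is only a sketch: the sequence `new_strand` is a made-up example (not the fragment shown on the slide), the function names are hypothetical, and it assumes every possible termination product is formed and detected.

```python
# Toy model of the four dideoxy reactions.
# ASSUMPTION: new_strand is an illustrative sequence, not the slide's fragment.
PRIMER_LENGTH = 20  # the slides use a 20 nt primer


def fragment_lengths(new_strand, primer_length=PRIMER_LENGTH):
    """Group fragment lengths by the ddNTP that terminated the chain."""
    lanes = {"A": [], "T": [], "C": [], "G": []}
    for offset, base in enumerate(new_strand, start=1):
        # A chain that stops at this base is primer + offset nucleotides long.
        lanes[base].append(primer_length + offset)
    return lanes


def read_gel(lanes):
    """Read the sequence by walking the fragments from shortest to longest."""
    by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(by_length[length] for length in sorted(by_length))


new_strand = "ACCGTAGC"  # bases added after the primer (hypothetical)
lanes = fragment_lengths(new_strand)
print(lanes)
print(read_gel(lanes))
```

Each length appears in exactly one lane, and reading the lanes in order of length gives back the newly synthesized strand, which is why the gel must resolve fragments that differ by one nucleotide.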
• DNA sequencing technology
requires a gel electrophoresis
system that can separate DNA
fragments differing in length
by a single nucleotide.
DNA sequencing, as performed
manually in the 1980s, was slow
and labor-intensive.
• NCBI HomePage
~1988- First big change in DNA
sequencing technology:
• Introduction of ‘automated DNA
sequencing’:
• This technique uses 4 fluorescent labels (red,
yellow, blue, green) rather than one radioactive
tag.
• The bases are read by a laser/detector rather than
by humans.
• York University
? Questions ?
Newest Innovations in
DNA Sequencing
Technology
• 1) Capillary Electrophoresis
• 2) Robotics
• CE Theory
Capillary Gel Electrophoresis:
“The capillaries we typically use in CE are
inexpensive and commercially available.
We use capillaries that range about 30 to 50
centimeters in length, 0.150 to 0.375
millimeters in outer diameter, and a 0.010 to
0.075 millimeter diameter channel down the
center.”
DNA sequencing with CE
# of capillary tubes/machine:
Initially- one (Introduced ~ 1998)
State of the Art- 2000: 96 tube CE (cost $300k)
Today- 384 tube CE (cost of one unit- $500k)
• DOE Joint Genome Institute
HUMAN GENOME PROJECT
(HGP)
• The ultimate goal of the HGP is to
decipher the 3.3 billion b.p. of the
human genome.
• When the project was initiated, it
was technologically infeasible.
Genomic Sequencing
Organisms sequenced
• Year # genomes sequenced
• 1994 0
• 1995 2
• 1996 4
• 1997 8 (est.)
• 1998 30 (est.)
• 2001 ~75
Genomics Research Funding
(selected programs; $ millions)
PROGRAM                   1998   2000
NHGRI (U.S.)               211    326
WELLCOME TRUST (U.K.)       61    121
STA (JAPAN)                 39    115
ENERGY (U.S.)               85     89
GHGP                        19     79
SWEDEN                       5     35
Why such a sudden increase in
funding??
• It became apparent that if the
public agencies didn’t get their act
together, an upstart organization
might sequence the HG before they
did (despite their ~ 8 year head
start).
Sequencing the human genome
suddenly had become a race.
• The competitors:
• Publicly funded genome centers,
scattered throughout the U.S.,
Europe, and Japan.
• Celera, the private company
directed by J. Craig Venter.
The story of how J. Craig
Venter brought about a
paradigm shift in genomic
sequencing has now entered
the mythology of science.
Craig Venter
Scientist of the Year
What was perhaps the most
important scientific event of the past
century occurred this year when scientists
announced the cracking of the human
genetic code. And what everyone, including
his numerous critics, acknowledges is that
the brash and impatient Venter is the man
who made it happen years before it would
have otherwise by throwing computing
power at the traditional, laborious process
of manually examining every bit of human
DNA to find the genes within.
• from Time Magazine:
Why did Craig Venter and his new
company Celera threaten the
established genome sequencers?
• Venter’s new company had 300
state-of-the-art sequencing machines
(about $300k each) and an $80 million
supercomputer.
• Venter suggested Celera could
sequence the genome in but 3 years at
a cost of $300 million.
Venter’s first company, TIGR,
pioneered the ‘shotgun
sequencing’ approach to
sequencing a genome:
• 1) Shear the DNA into thousands of random pieces.
• 2) Sequence the DNA of each fragment.
• 3) Use a computer to align the overlapping
fragments to produce a single, contiguous DNA
sequence of the entire organism.
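Step 3 can be illustrated with a very small greedy assembler. This is a sketch only, assuming error-free reads from one strand and no repeats; it is not the software TIGR or Celera used, and the function names and the reads are made up for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: a gap that would need 'finishing'
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [
            r for k, r in enumerate(reads) if k not in (best_i, best_j)
        ] + [merged]
    return reads


# Hypothetical reads sheared from the made-up sequence ATGGCGTACCTTAGGA.
reads = ["CCTTAGGA", "ATGGCGTA", "CGTACCTT"]
contigs = greedy_assemble(reads)
print(contigs)
```

With real data the same idea has to cope with sequencing errors, both strands, and repeated sequence, which is why powerful alignment software and heavy over-sequencing are needed.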
Advantages/Disadvantages of the
‘shotgun approach’:
Disadvantages
• Requires significant over-sequencing
• Requires powerful alignment software
• There may be problems ‘finishing’ certain
regions
Advantages
• Eliminates the need for mapping
Sequencing of Archaeoglobus
fulgidus:
• 29,000 sequencing reactions
• 500 bp average ‘read’
• 14,500,000 bases aligned -> 2,178,400 bp.
• 6.7- fold sequence coverage
(14,500,000 / 2,178,400 = 6.7)
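The coverage arithmetic above can be reproduced in a few lines. The numbers are the ones on the slide; the function name `fold_coverage` is just an illustrative choice.

```python
def fold_coverage(reactions, average_read_bp, genome_bp):
    """Total bases sequenced divided by the genome length."""
    total_bases = reactions * average_read_bp
    return total_bases, total_bases / genome_bp


# Figures from the Archaeoglobus fulgidus slide.
total_bases, coverage = fold_coverage(
    reactions=29_000, average_read_bp=500, genome_bp=2_178_400
)
print(total_bases)         # 14500000
print(round(coverage, 1))  # 6.7
```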
Even with remarkable success
sequencing bacterial genomes,
skeptics doubted a whole genome
random sequencing approach would
work with a eukaryotic genome.
Why?
2 Reasons
• Eukaryotic genomes are much larger.
• Eukaryotic genomes carry significant amounts
of repetitive DNA.
Who won the race?
• With much fanfare, the race for the
rough draft of the human genome was
‘declared’ a draw. Both Celera and
the various public agencies shared
credit for the rough draft of the
human genome (announced June 2000;
published Feb. 2001).
Insert Video (10’)
What is meant by the term
mapping?
• Mapping to a geneticist means the same as
it does to a non-scientist:
• A drawing showing the spatial relationship
between a series of points.
Traditional map:
Western U.S.
Seattle - Portland - S.F. - L.A.
Gene Map:
Human Chromosome # 11
Hemoglobin-β
Insulin
Parathyroid Hormone
Albinism
Mouse Clickable Cytogenetic Map
Chromosome X is selected
Restriction Enzyme Map
 HindIII     EcoRI   HindIII   HindIII
____|__________|________|_________|_
Construction of various maps has
been a major goal of genetic
research. Why?
• Maps serve as navigational tools. They are
useful in finding genes or other genetic
features and ordering fragments of DNA.
• There is a direct correlation between the
usefulness of a map, and the number of
points on the map. Analogy??
The STS map:
• STS = sequence-tagged site.
• STSs are short, unique fragments of DNA generated
by PCR.
• Verification of a human STS: PCR amplification of
the human genome generates one small fragment ->
unique landmark
Usefulness of STSs
• STSs are used to find overlaps between
fragments of genomic DNA.
• Finding overlaps -> ordering of
fragments (see handout).
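A rough sketch of how shared STSs order fragments: two fragments that both test positive for the same STS must overlap, and chaining the overlaps gives the order. The clone names and STS labels below are invented, and the code assumes a simple linear tiling in which each fragment overlaps only its immediate neighbours.

```python
from itertools import combinations


def sts_overlaps(sts_content):
    """Return the pairs of fragments that share at least one STS."""
    pairs = {}
    for a, b in combinations(sorted(sts_content), 2):
        shared = sts_content[a] & sts_content[b]
        if shared:
            pairs[(a, b)] = shared
    return pairs


def order_fragments(sts_content):
    """Walk from one end fragment to the other along the overlaps.

    ASSUMPTION: the fragments form a single linear chain.
    """
    neighbours = {name: set() for name in sts_content}
    for a, b in sts_overlaps(sts_content):
        neighbours[a].add(b)
        neighbours[b].add(a)
    # An end fragment overlaps exactly one other fragment.
    start = min(n for n, linked in neighbours.items() if len(linked) == 1)
    order, previous = [start], None
    while len(order) < len(sts_content):
        candidates = [n for n in neighbours[order[-1]] if n != previous]
        if not candidates:
            break
        previous = order[-1]
        order.append(candidates[0])
    return order


# Hypothetical PCR results: which STSs amplify from which fragment.
sts_content = {
    "clone_C": {"STS3", "STS4"},
    "clone_A": {"STS1", "STS2"},
    "clone_B": {"STS2", "STS3"},
}
print(sts_overlaps(sts_content))
print(order_fragments(sts_content))
```

The more STSs there are on the map, the more overlaps can be detected, which matches the earlier point that a map's usefulness grows with the number of points on it.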
Expressed Sequence Tags (ESTs)
• As of June 2000, the 4.6 million
EST records comprised 62% of the
sequences in GenBank. Although
the original ESTs were of human
origin, NCBI’s EST database
(dbEST) now contains ESTs from
over 250 organisms.
What is an EST?
Short DNA sequence representing a gene
expressed in a particular tissue. A given
EST often represents a fraction of the gene.
ESTs are often produced by sequencing the
ends of a cDNA (complementary DNA).
What is the value of ESTs?
• Rapid identification of genes.
Feb. 1992- Craig Venter and 14 co-workers
published the partial DNA sequence of
2,375 genes expressed in the human brain.
This represented about half of the total
human genes known at the time.
How to sequence a
genome???
• 1) Quickly- focus on the genes and
their regulatory regions and human
polymorphisms.
• 2) Thoroughly and completely- every
nucleotide with 99.99% accuracy.
Extra Slides 
• Does completion of HGP -> identification
of all disease genes?
• A Timeline of The Human Genome
• Year: # human genes mapped to a definite
chromosome location; # years it would take to
sequence the human genome
• 1967: none; sequencing not possible yet
• 1977: 3 genes mapped; 4,000,000 years to finish at 1977 rate
• 1987: 12 genes mapped; 1,000 years to finish at 1987 rate
First Sequenced Genome:
• May 1995, TIGR researchers led by Robert
Fleischmann closed the last gaps in the
Haemophilus genome. In total, 26,708
sequences had been assembled to span the
1,830,137 base pair genome of the
bacterium. The genome was published in
July. (Fleischmann et al., Science, 269: 496-512, 1995).
• DNALC: Cycle Sequencing
In the February 16, 2001 issue of Science, Venter et
al. announce the sequencing of the euchromatic
portion of the human genome by a whole-genome
shotgun sequencing approach. The
sequencing achievement was accomplished by
Celera Genomics in nine months in a factory-scale
project involving 300 automatic sequencing
machines producing 175,000 sequence-reads
per day. The company generated 14.8
gigabases (Gb) of DNA sequence and combined
data with the public GenBank database to
generate a 2.91 Gb consensus sequence (94%
coverage) representing over eight-fold coverage
of the genome.