SEQUENCING AND THE HUMAN GENOME PROJECT

HUMAN GENOME PROJECT
What is the Human Genome Project?
Goal: Sequence all of the nucleotides in the human
DNA sequence (“genome”).
HUMAN GENOME PROJECT
Why: A. To understand how genes work.
B. To understand why some genes don’t work.
HUMAN GENOME PROJECT
Who: A. National Institutes of Health, Dept. of Energy
B. International Project
When: A. 1990; finish in 15 years
HUMAN GENOME PROJECT
When: B. First chromosome sequenced (22) - 1999
C. 1/3 of genome
completed 1999
“Cracking the Code of Life” Chapter 7, first segment
NOTE – all Cracking the Code segments can be found at
http://www.pbs.org/wgbh/nova/genome/program.html
Celera
A. Private company founded by Craig Venter,
former NIH scientist
B. Finish project in 2 years?
“Cracking the Code of Life” Chapter 4 (6:29)
How do you profit from sequencing
the human genome?
“Cracking the Code of Life” Chapter 8 (4:06)
Sell information for scientists to look at.
Eventually, public project will complete HGP, so
what do you sell then?
“Cracking the Code of Life” Chapter 7 second
segment (49:10).
How do you profit? continued
Patenting DNA sequences – is this right?
Whose data is it?
Does patenting DNA sequences
encourage or discourage research from
being done?
Who won?
Both groups shared credit for “finishing”
the HGP in 2001.
Competition sped up sequencing process.
“Cracking the Code of Life” link?
What Have We Learned From
HGP?
Humans are 99.9% identical.
Total number of genes ~ 30,000. This
doesn’t match the number of proteins
(over 100,000) so each gene must be able
to code for more than one protein.
Over 50% of genes have unknown
functions.
What Have We Learned From
HGP?
Less than 2% of DNA codes for genes.
Most genes are clustered in “urban
centers” (not randomly spread out).
Over 50% of DNA is “not human” –
hitchhiking “junk” DNA.
What’s next?
Gene regulation – how do genes know
when to turn on and off?
Proteome – what proteins do these genes
code for and what do the proteins do?
Personalized medicine – medications to
treat you based on your genetics.
What’s next?
Copy Number Variant – reading
SNP’s – reading
Epigenetics – reading
STR’s - lab
How does sequencing work?
The Key
Missing oxygen = deoxyribonucleic acid
Missing oxygen #2 = dideoxyribonucleic acid
This is a nucleotide called a
dideoxynucleotide.
Why are dideoxynucleotides
important?
Since there is no oxygen (no -OH group) on the 3’ end, no
additional nucleotides can be added.
DNA synthesis is stopped.
What is needed for a Sequencing
Reaction?
Original DNA
Nucleotides
Primer
DNA Polymerase
“Detectable” dideoxynucleotides
(radioactivity or fluorescence)
Now it’s your turn to sequence!
How does a Sequencing Reaction
work?
www.dnai.org
- manipulation
- techniques
- sorting and sequencing
- cycle sequencing
Three steps
1. Denaturing – 95°C
2. Annealing – 50°C
3. Extension – 60°C
Only one cycle, so expensive Taq polymerase
is not needed
How does a Sequencing Reaction
work?
Nucleotides are randomly selected by DNA
Polymerase.
Synthesis is stopped when a ddNTP is randomly
selected.
Sequences of varying lengths are produced.
How would we separate these differently sized
pieces?
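The random stop described above can be sketched in a few lines of Python. This is a minimal simulation, not a model of the real chemistry: the function name, the `dd_fraction` parameter, and the lowercase letter that marks the labeled ddNTP are all assumptions made for illustration. The template is written 3’ to 5’ so that the new strand reads left to right.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def simulate_chain_termination(template, n_molecules=1000, dd_fraction=0.1, seed=0):
    """Copy `template` many times, stopping at random, and return the fragments.

    Each fragment ends in a lowercase letter that stands for the labeled
    ddNTP that stopped synthesis. Copies that reach the end of the template
    without picking up a ddNTP carry no label and are not returned.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fragments = set()
    for _ in range(n_molecules):
        strand = []
        for base in template:
            new_base = COMPLEMENT[base]
            # With probability dd_fraction the polymerase picks a ddNTP.
            if rng.random() < dd_fraction:
                strand.append(new_base.lower())
                fragments.add("".join(strand))
                break  # no 3' -OH, so nothing more can be added
            strand.append(new_base)
    return fragments


fragments = simulate_chain_termination("TACGGATC", dd_fraction=0.3)
for fragment in sorted(fragments, key=len):
    print(fragment)
```

With enough copies, a fragment is produced for every possible stopping point, so the collection differs in length one base at a time.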
How does a Sequencing Reaction
work?
Gel Electrophoresis
Laser detects the
fluorescence of each
ddNTP
Computer records the
order of the colors
(order of the bases)
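The reading step can be sketched as follows, assuming the fragments have already been produced and that the last letter of each one (written in lowercase) is the labeled ddNTP. The `read_gel` name and the dye colors are hypothetical; real instruments may assign colors differently.

```python
# Illustrative dye colors only; these are an assumption for the sketch.
DYE_COLORS = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def read_gel(fragments):
    """Sort fragments shortest to longest and read the labeled end of each."""
    # Gel electrophoresis separates by size: short pieces travel fastest.
    ordered = sorted(fragments, key=len)
    bases = [fragment[-1].upper() for fragment in ordered]
    colors = [DYE_COLORS[base] for base in bases]
    return "".join(bases), colors


# Fragments in no particular order, as they would be in the reaction tube.
fragments = ["ATGc", "a", "ATGCCt", "ATg", "At", "ATGCc"]
sequence, colors = read_gel(fragments)
print(sequence)  # ATGCCT
print(colors)
```

The order of the colors passing the detector is the order of the bases in the newly made strand.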
How does a Sequencing Reaction
work?
Results are presented as
an “electropherogram”.
www.dnai.org
- manipulation
- techniques
- Interview
“Inside an automated
sequencer”.
Sequencing Process Review
Sequencing Animation
Now it’s your turn to sequence,
Part 2!
How do you sequence so many
letters so quickly?
Shotgun sequencing –
divide many copies of
genome into small
bits. Sequence each
fragment. Use
computers to align
sequence.
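The "use computers to align" step can be illustrated with a small greedy overlap merge. This is only a sketch under simple assumptions (error-free reads, no repeats, every read from the same strand); the function names and the `min_len` parameter are made up for the example, and real assembly software is far more elaborate.

```python
def overlap(a, b, min_len=3):
    """Length of the longest end of `a` that matches the start of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly join the two reads that share the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        # A read lying wholly inside another adds no new information.
        reads = [r for r in reads if not any(r != o and r in o for o in reads)]
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left that overlaps; return separate pieces
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


reads = ["CATCGGA", "AGCTTAGG", "TAGGCATC"]
print(greedy_assemble(reads))  # ['AGCTTAGGCATCGGA']
```

The three reads only fit together one way because their ends overlap. Without overlapping ends there would be no way to know their order.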
How do you sequence so many
letters so quickly?
www.dnai.org
- genome
- The Project
- Putting It Together
- Animations
- Whole Genome Shotgun (private)
How do you sequence so many
letters so quickly?
www.dnai.org
- genome
- The Project
- Putting It Together
- Sequencing Game
So what can you conclude about
shotgun sequencing?
Overlapping provides a context (unlike the first
Mouse and Cookie sentence fragments).
Requires multiple copies, each copy cut with a
different restriction enzyme, to generate
overlapping pieces.
Up to 8% of the human genome remains
unsequenced due to highly repetitive sections
(especially the ends and middles of chromosomes:
telomeres and centromeres).
Whose DNA was sequenced?
Public – a random couple from Buffalo, NY
Celera – random, nameless volunteers
(though Dr. Venter’s DNA was “randomly”
selected)
What’s next?
To learn which sequences lead to genetic
disorders, many different human genomes
need to be sequenced.
Which is more important to
studying genetic disease?
Sequences that are the same?
Sequences that are different?
WHY?
What are those differences called?
SNP’s – single nucleotide
polymorphisms; DNA
sequence that is one letter
different.
Develop “personalized
medicine” based on the exact
SNP causing genetic disorder.
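Finding "one letter different" positions can be sketched as a letter-by-letter comparison. This assumes the two sequences are already lined up and are the same length; the `find_snps` name and the example sequences are invented for illustration.

```python
def find_snps(reference, sample):
    """Return (position, reference base, sample base) for each difference.

    Positions are counted from 1. Both sequences must already be aligned
    and the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(
            zip(reference, sample), start=1
        )
        if ref_base != sample_base
    ]


reference = "ATGCCTGA"
sample = "ATGTCTGA"
print(find_snps(reference, sample))  # [(4, 'C', 'T')]
```

Most positions match, so the short list of differences is where the search for disease-related SNP’s would begin.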
Are SNP’s the whole story?
CNV’s – copy number variants; not everyone
has two copies of each gene.
Higher number of gene copies, higher level of
protein might be produced; not necessarily
good.
Ex. EGFR copy number can be higher than
normal in some types of lung cancer cells.
Copy Number Variants
What else is next?
Epigenome – changes made to DNA
structure without altering the sequence of
bases.
These changes quite often involve a
methyl (-CH3) group to tag or mark a gene.
Cell normally uses these methyl tags to
“turn off” a gene.
DNA or histones are
methylated.
Does this mean that identical twins
don’t have to be . . . Identical?
YES! Think of
the Agouti
mice.
Only difference
is what the
mom ate prior
to conception
and birth.
Other epigenome examples?
Let’s go to the video!
So what’s the sequencing
“revolution”?
Original sequencing
reactions used
radioactive ddNTP’s,
not fluorescent ones.
Results looked like:
Problems with Radioactive
Sequencing
Very difficult to read
results
Cannot reuse a
machine that has been
exposed to radioactivity
Again, what’s the “revolution”?
Computers and fluorescent ddNTP’s
Machines can automatically run a sequencing
reaction.
Computers can store sequencing data.
Fluorescent ddNTP’s make machines reusable.
“100 letters in a day vs. 1000 letters every
second”
More HGP Info
Cracking the Code of Life – Chapters 4, 5, 6,
and 16
(http://www.pbs.org/wgbh/nova/genome/program.html)
In order to sequence all DNA, Celera relied on
freely available DNA sequence from public
research group.
Who finished first – public or private research
group? When?
ELSI?
Ethical, Legal, and Social Issues
At the beginning of the project, genetic privacy
was one of the major concerns – as we learn
more about our own DNA sequences:
– Who should have access to that information?
– How do you help someone interpret that information
and decide how to act on it?
ELSI?
ELSI Video
Human Genome Project
Cracking the Code of Life – Chapter Two
“Getting the Letters Out”
http://www.pbs.org/wgbh/nova/genome/program.html