Nucleic acids Chemistry 256

Nucleic acids = polynucleotides
• The term includes both DNA (deoxyribonucleic
acid) and RNA (ribonucleic acid)
• Polymers of nucleotides
• Nucleotide = nitrogenous base linked to a
sugar linked to at least one phosphate
The nitrogenous base distinguishes the
molecules; there are only five “interesting” ones
Note the numbering
system of atoms in the
structures; these are
not-prime (as opposed
to the primed numbers
in the ribose moiety).
Sugar = ribose
• “Deoxy-” means that the 2’ carbon does not carry a
hydroxy group. The carbons of the ribose are
“primed” to distinguish them from the atoms of the
nitrogenous base, which take the unprimed numbers.
Complete nucleotide
• The 3’ and 5’ carbons’ hydroxy groups are the
attachment points for phosphate groups.
The phosphates link the
nucleosides together
• Phosphodiester bonds.
• Like proteins, the polymer
is asymmetric.
• The convention is to read
the sequence of
nucleosides from the 5’
end (the end with the free
phosphate on the 5’
carbon) to the 3’ end.
• Reflects how enzymes
“read” the sequence.
Schematic nucleic acid polymer representation
• Nucleosides = ribose + base; A = adenosine, C = cytidine, G =
guanosine, T = thymidine
• Nucleotides = ribose + base + phosphates; A = adenine, C =
cytosine, G = guanine, T = thymine (these are actually the base
names, but hardly anyone uses the full nucleotide name, like
cytidine monophosphate (CMP))
In addition, U =
uracil (for RNA – no
T in RNA)
Brief history of DNA discovery
• Erwin Chargaff (Columbia University) – “The
Separation and Quantitative Estimation of Purines
and Pyrimidines in Minute Amounts”, J. Biol.
Chem.(1948) found that the number of A nucleotides
in a sequence of DNA equals the number of T
nucleotides; similarly, # of C = # of G (“Chargaff’s
rules”).
• However, percentage of A+T does not equal
percentage of C+G; for mammals, % G+C is about 39
– 46%.
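Chargaff's rules and the G+C percentage can be checked on any double-stranded sequence. The following is a minimal Python sketch, not Chargaff's procedure: the example strand is made up, and the helper names `base_counts_duplex` and `gc_percent` are hypothetical.

```python
from collections import Counter

# Watson-Crick partners for DNA bases
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def base_counts_duplex(strand):
    """Count bases over both strands of the duplex built from one strand."""
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)

def gc_percent(counts):
    """Percentage of G+C among all counted bases."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total

strand = "AAGGATGA"  # hypothetical example; A and T are unequal on this strand alone
counts = base_counts_duplex(strand)
print(counts["A"], counts["T"], counts["G"], counts["C"])
print(gc_percent(counts))
```

In the duplex, A equals T and G equals C even though the single strand is lopsided, while the G+C percentage is free to differ from 50%.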
Double helix
• Though James Watson and Francis Crick (Cambridge
University) published “A Structure for Deoxyribose
Nucleic Acid” in Nature (1953) and are credited with
working out the double-helical structure of DNA,
they benefited from many researchers’ work.
Tautomeric form of bases
• Jerry Donohue
(Cambridge) found
that the bases can
exist as keto or enol
tautomers, but the
bases in DNA only
exist as keto
tautomers.
• Led to the idea of
hydrogen bonding
between two DNA
strands.
X-ray crystallography
• Maurice Wilkins (who
won the 1962 Nobel Prize,
along with Watson and
Crick), Alex Stokes and
Herbert Wilson (King’s
College), and Rosalind
Franklin and Raymond
Gosling (King’s College)
publish simultaneous
papers in Nature (1953)
showing X-ray
crystallographic evidence
of double helical
structure.
Ironically, Linus Pauling had published a
proposed structure for nucleic acids – a triple
helix – earlier in the year; his model was wrong
in part because he did not have good X-ray
crystallography data.
The details of the double helix
• The two strands of DNA run antiparallel (Watson and Crick).
• The two strands wind around a
common helical axis;
there is a major groove and a
minor groove (Wilkins et al.).
• The bases are in the core of the
structure and the phosphates
make up the “backbone” of the
helices (Franklin and Gosling).
• The bases form hydrogen bonds
with complementary bases
(“base pairing”) to connect two
strands (Watson and Crick).
Double helix has a handedness
• DNA (both helices) is right-handed.
Base pairing
• Note that it is always a
purine base pairing with
a pyrimidine base.
• A/T pair is held together
by two hydrogen bonds;
G/C pair, three hydrogen
bonds.
• Thus, one DNA strand
can act as a template for
building the other strand.
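The template idea can be sketched in a few lines of Python. This is an illustrative example only: the template sequence is made up and the function names are hypothetical. It builds the partner strand from the pairing rules and counts hydrogen bonds at two per A/T pair and three per G/C pair.

```python
# Watson-Crick partners for DNA bases
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(template):
    """Return the partner strand, written 5' to 3'."""
    # Pair each base, reading the template backwards because
    # the two strands run antiparallel.
    return "".join(COMPLEMENT[base] for base in reversed(template))

def hydrogen_bonds(template):
    """Total hydrogen bonds in the duplex: 2 per A/T pair, 3 per G/C pair."""
    return sum(2 if base in "AT" else 3 for base in template)

template = "ATGCCG"  # hypothetical template, written 5' to 3'
print(complementary_strand(template))
print(hydrogen_bonds(template))
```

Applying `complementary_strand` twice returns the original template, which is the sense in which either strand can serve as the template for the other.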
Describing DNA organization
• Genome → chromosomes → genes
• Measure DNA length in terms of base pairs
(bp) or bases (b); a kilobase (kb) is a
commonly used unit. 250000 kb in
mammalian DNA.
• Each chromosome is a separate DNA
molecule.
• Two sets of chromosomes = diploid;
one set = haploid.
RNA can form stem-loops
• RNA is single-stranded.
• Segments within
RNA are self-complementary
and can base pair
with other
segments to form
“stem-loops”.
• Important
structure in tRNA,
for instance.
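Whether two segments of one RNA strand can pair into a stem can be tested directly. The sketch below is a simplification under stated assumptions: it allows only Watson-Crick pairs (no G/U wobble pairs), the sequence is made up, and `can_form_stem` is a hypothetical helper.

```python
# Watson-Crick partners for RNA bases (wobble pairs deliberately ignored)
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def can_form_stem(five_prime, three_prime):
    """True if the two segments pair base-for-base when aligned antiparallel."""
    if len(five_prime) != len(three_prime):
        return False
    return all(RNA_PAIR[a] == b
               for a, b in zip(five_prime, reversed(three_prime)))

rna = "GGGCUUUUGCCC"  # hypothetical: 4-base stem, 4-base loop, 4-base stem
stem_5, loop, stem_3 = rna[:4], rna[4:8], rna[8:]
print(can_form_stem(stem_5, stem_3))
```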
Brief history of DNA function
• Gregor Mendel (1866) – “factors” passed on by
inheritance
• Frederick Griffith (1928) – living nonvirulent
pneumococci mixed with dead virulent
pneumococci give rise to living virulent bacteria:
“transformation” by an “inheritance molecule”.
• Oswald Avery, Colin MacLeod and Maclyn
McCarty (1944) – transformation of nonvirulent
pneumococci to virulence by DNA transfer
• George Beadle and Edward Tatum (1941) found that
genes lead to the production of enzymes. “One gene,
one enzyme” theory.
• Francis Crick (1958) – “Central dogma” of biology
(DNA → RNA → proteins)
• Sidney Brenner and Francis Crick (1961) – The codon
is a three-nucleotide sequence in DNA that “codes”
for a particular amino acid.
• Marshall Nirenberg (1961) – deciphers first codon
(UUU codes for phenylalanine); in five years, he and
coworkers decode the rest of the genetic code.
Basic ideas behind the central dogma
• DNA is replicated
during cell division to
make a copy of the
DNA.
• When the gene
is expressed, a
messenger RNA
(mRNA) copy is
transcribed
from the DNA.
• Finally, the
mRNA is
transported to
the ribosome
where it is
translated into
a peptide
sequence using
transfer RNAs
(tRNA).
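The two steps, transcription and translation, can be sketched in Python. This is an illustration under stated assumptions: the gene sequence is made up, only a handful of entries from the standard genetic code are included, and the input is taken to be the coding (sense) strand, so the mRNA has the same sequence with U in place of T.

```python
# Partial codon table: a few standard codons only, enough for the example.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe", "GGC": "Gly",
    "AAA": "Lys", "UGG": "Trp", "GCU": "Ala",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_strand):
    """mRNA carries the coding strand's sequence with U replacing T."""
    return coding_strand.replace("T", "U")

def translate(mrna):
    """Read codons 5' to 3', three bases at a time, until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

gene = "ATGTTTGGCAAATAA"  # hypothetical coding strand, 5' to 3'
mrna = transcribe(gene)
print(mrna)
print("-".join(translate(mrna)))
```

A full implementation would need all 64 codons; a codon missing from this partial table raises a `KeyError`.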
Detail of translation of proteins
Sequencing DNA
• To determine the 5’ to 3’
sequence of the nucleotides
in a DNA strand, one could
hydrolyze the
phosphodiester bond on
either side and remove one
nucleotide at a time and
identify it.
• Robert Holley and others
(“Nucleotide Sequences in
the Yeast Alanine Transfer
Ribonucleic Acid”, J. Biol.
Chem. (1965)) took seven
years to sequence a 76-nucleotide tRNA.
Faster way
• Restriction
endonucleases exist
in bacteria to resist
the actions of
bacteriophages.
• These enzymes
cleave at a particular
DNA sequence,
typically 4 or 6
nucleotides long.
• Smith, Kelly and
Wilcox (1970) isolate
the first enzyme,
HindII; Hamilton Smith
shares the 1978 Nobel
Prize with Werner Arber
and Daniel Nathans.
Restriction mapping
• By comparing the
overlapping fragments
produced by cutting at
different restriction “sites”
on the target DNA, and
sizing them by
electrophoresis, the sites
along a length of DNA
can be put in order.
• Fragments thus
cleaved can be
“blunt-ended” or
“sticky-ended”.
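A restriction digest can be sketched as a string search. The example below is an assumption-laden illustration: it uses EcoRI (recognition site GAATTC, cut after the G), an enzyme not named in these slides, on a made-up sequence, and it tracks only one strand. `digest` is a hypothetical helper.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut one strand of `dna` at every occurrence of `site`.

    cut_offset is the position inside the site where the backbone is cut;
    1 corresponds to G^AATTC (EcoRI), 3 would be a blunt cut mid-site.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # piece after the last cut
    return fragments

dna = "AAGAATTCGGGGAATTCTT"  # hypothetical target with two sites
fragments = digest(dna)
print(fragments)
print(sorted(len(f) for f in fragments))
```

Because EcoRI cuts the opposite strand at the mirrored position, each cut leaves a four-base AATT overhang, which is what makes the ends "sticky". The sorted lengths are what a gel of the digest would resolve.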
Gel electrophoresis
• DNA is negatively
charged so it will be
attracted to the
anode of an
electrical cell.
• As in protein gels,
larger fragments migrate
more slowly: the migration
distance decreases roughly
linearly with the logarithm
of the size (molecular
weight) of the DNA
fragment.
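Over a limited size range, migration distance is commonly treated as linear in the logarithm of fragment size, so a ladder of known fragments can calibrate an unknown band. The sketch below fits that line by least squares; the ladder distances and sizes are invented for illustration, and the function names are hypothetical.

```python
import math

def fit_log_size(standards):
    """Least-squares fit of log10(size_bp) = slope * distance + intercept."""
    xs = [distance for distance, _ in standards]
    ys = [math.log10(size_bp) for _, size_bp in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_size(distance, slope, intercept):
    """Invert the calibration line to get a size in base pairs."""
    return 10 ** (slope * distance + intercept)

# Hypothetical ladder: (migration distance in mm, fragment size in bp)
ladder = [(10.0, 10000), (20.0, 1000), (30.0, 100)]
slope, intercept = fit_log_size(ladder)
unknown_bp = estimate_size(25.0, slope, intercept)
print(slope, intercept, unknown_bp)
```

The negative slope expresses the rule that larger fragments travel a shorter distance.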
Chain terminator (dideoxy) method
• The discovery of DNA polymerase I by Arthur Kornberg
(Washington University) in 1956 allowed Walter Gilbert
(Harvard) and Frederick Sanger (Cambridge) twenty years
later to develop a much quicker sequencing system.
Make every possible chain length
• Use DNA
polymerase to
make copies of
the target DNA,
use radioactively
labeled
dideoxynucleotides
(ddNTPs) to end
the strand
growth, then use
electrophoresis
to separate the
different lengths.
Dideoxynucleotides (ddNTP)
• Terminate chain growth because there is no 3’
hydroxy group to which the next nucleotide’s
phosphate can attach!
Just read it off!
• Now automated;
can sequence
kilobases in
minutes.
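The logic of reading a dideoxy gel can be sketched as follows. This is an idealized model, assuming every possible chain length is produced and that each of four reactions contains a single ddNTP; the strand is made up and the function names are hypothetical.

```python
def chain_terminated_fragments(new_strand):
    """For each ddNTP reaction, list the fragment lengths ending in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length in range(1, len(new_strand) + 1):
        terminal_base = new_strand[length - 1]
        lanes[terminal_base].append(length)
    return lanes

def read_gel(lanes):
    """Read bands from shortest to longest; the lane names the base."""
    by_length = {length: base
                 for base, lengths in lanes.items()
                 for length in lengths}
    return "".join(by_length[length] for length in sorted(by_length))

new_strand = "GATTACA"  # hypothetical strand synthesized 5' to 3'
lanes = chain_terminated_fragments(new_strand)
print(lanes)
print(read_gel(lanes))
```

Each length appears in exactly one lane, so sorting by length and noting the lane recovers the synthesized strand base by base.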
Shotgun sequencing
The limit of the
Sanger method is <
1000 bp. By breaking
up the original
sequence into
multiple random
small fragments
(<1000 bp), and doing
the breakup several
times, and
sequencing all the
fragments,
overlapping
sequences can be
combined to yield the
overall sequence.
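Combining overlapping reads can be sketched with a greedy merge. This is a toy illustration, not how production assemblers work: it assumes error-free reads from one strand, no repeats, and a made-up set of fragments; `overlap` and `greedy_assemble` are hypothetical helpers.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave the contigs separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags

# Hypothetical reads covering a 15-base target
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
contigs = greedy_assemble(reads)
print(contigs)
```

Repeated sequences longer than the reads can mislead a greedy merge, which is one reason the breakup is done several times to give redundant coverage.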
Human genome project
• In 1990, the US DOE and NIH started a $3 billion
project to map the 3.3 billion bases of the human
genome.
• In 1998, a more modest private effort was started by
Craig Venter and Celera Genomics.
• Between the two, the complete “final draft” human
genomic sequence was published in 2003, two years
earlier than planned.
• In March 2000, President Clinton and Prime Minister
Tony Blair jointly stated that raw human genome
sequence data should be freely available to
researchers; Celera stock plummeted.
Human genome project results
• 1. There are approximately 23,000 genes in human
beings, the same range as in mice and roundworms.
• 2. Between 1.1% and 1.4% of the genome sequence codes
for proteins.
• 3. The human genome has significantly more duplicated
segments within it than other mammalian genomes do.
These sections may be the source of new
primate-specific genes.
• 4. At the time the draft sequence was published,
less than 7% of protein families appeared to be
vertebrate-specific.
• 5. Two randomly selected human genomes differ, on
average, by only 1 nucleotide per 1250.
Cloning simply means adding a section of foreign
DNA into a vector DNA (and having it copied)
• Typical vectors
are plasmids,
double-stranded
circular DNA
molecules
that replicate
within bacteria.
• Plasmids can’t
accept foreign
DNA larger than
10 kb.
• Ligase is the
enzyme used to
reseal the opened
section of the
vector DNA.
How to tell if the foreign DNA has “taken”
• Disrupt a non-critical
gene in the vector with
the insert – for instance
the beta-galactosidase
gene. The vector then
cannot cleave a
galactoside’s anomeric
(glycosidic) bond, so
colonies carrying that
form of the vector stay
colorless instead of
turning blue.