DNA structure
DNA is usually a double-helix and has two strands running in opposite directions. (There are
some examples of viral DNA which are single-stranded). Each chain is a polymer of subunits
called nucleotides (hence the name polynucleotide).
Each strand has a backbone made up of (deoxy-ribose) sugar molecules linked together by
phosphate groups. The 3' C of a sugar molecule is connected through a phosphate group to
the 5' C of the next sugar. This linkage is also called a 3'-5' phosphodiester linkage. By
convention, DNA strands are read from the 5' to the 3' end, where the 5' end terminates in a
phosphate group and the 3' end terminates in the free 3' hydroxyl (-OH) group of a sugar molecule.
Each sugar molecule is covalently linked to one of 4 possible bases (Adenine, Guanine,
Cytosine and Thymine). A and G are double-ringed larger molecules (called purines); C and
T are single-ringed smaller molecules (called pyrimidines).
In the double-stranded DNA, the two strands run in opposite directions and the bases pair up
such that A always pairs with T and G always pairs with C. The A-T base-pair has 2
hydrogen bonds and the G-C base-pair has 3 hydrogen bonds. The G-C interaction is
therefore stronger (by about 30%) than A-T, and A-T rich regions of DNA are more prone to
thermal fluctuations.
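The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the names `reverse_complement` and `hydrogen_bonds` are made up for this example, and the sketch assumes an upper-case sequence that contains only A, C, G and T.

```python
# Watson-Crick partners for DNA bases (A pairs with T, G pairs with C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base-pair, as stated in the text: 2 for A-T, 3 for G-C.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    The two strands run in opposite directions, so the partner is the
    complement of the input read in the reverse direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds in the duplex formed by this strand."""
    return sum(H_BONDS[base] for base in strand)


if __name__ == "__main__":
    seq = "ATGCGC"  # hypothetical example sequence
    print(reverse_complement(seq))  # partner strand, 5' to 3'
    print(hydrogen_bonds(seq))      # total hydrogen bonds in the duplex
```

Counting hydrogen bonds this way only captures the difference between A-T and G-C pairs; it ignores the stacking interactions between neighboring base-pairs.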
The bases are oriented perpendicular to the helix axis. They are hydrophobic in the direction
perpendicular to the plane of the bases (cannot form hydrogen bonds with water). The
interaction energy between two bases in a double-helical structure is therefore a combination
of hydrogen-bonding between complementary bases, and hydrophobic interactions between
the neighboring stacks of base-pairs.
Even in the single-stranded state, the bases prefer to be stacked (like the steps of a spiral
staircase if the bases are identical) and a single-stranded chain can also have regions of
helical conformation.
The backbone of a polynucleotide is highly charged (1 unit of negative charge for each
phosphate group; 2 negative charges per base-pair). If there is no salt in the surrounding
medium, there is a strong repulsion between the two strands and they will fall apart.
Therefore counter-ions are essential for the double-helical structure. Counter-ions shield the
charges on the sugar-phosphate backbone. They may also contribute an attractive
interaction from fluctuating counter-ions around the backbone, similar to the van der Waals
interactions between fluctuating induced dipoles.
The most common DNA structure in solution is the B-DNA. Under conditions of applied
force or twists in the DNA, or under low hydration conditions, it can adopt several helical
conformations, referred to as the A-DNA, Z-DNA, S-DNA...
Shown in the picture above are three crystallized states of DNA: A-DNA (left), B-DNA
(middle) and Z-DNA (right). The A-form crystallizes under low hydration conditions and is
not normally found for DNA in the cell. It is, however, the structure adopted by
double-stranded regions in RNA as well as by the transient double-helix between DNA and RNA
during transcription. Both A- and B-DNA are right-handed helices, whereas Z-DNA is a left-handed
helix and is commonly found in regions of DNA that have alternating purine-pyrimidine
sequences (e.g. 5'-CGCGCGCG-3' or 5'-CGCGCATGC-3'). The table below summarizes
some of the major differences.
                            A-DNA              B-DNA              Z-DNA
Helix sense                 Right-handed       Right-handed       Left-handed
Overall shape               Short and broad    Long and thin      Longer and thinner
Helix diameter              25.5 Å             23.7 Å             18.4 Å
Rise / base-pair            2.3 Å              3.4 Å              3.8 Å
Base-pairs / helical turn   ~ 11               ~ 10               ~ 12
Helix pitch                 25 Å               34 Å               47 Å
Tilt of the bases           20°                -1°                -9°
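As a quick consistency check on the table, the helix pitch should be roughly the rise per base-pair multiplied by the number of base-pairs per helical turn. The sketch below uses the tabulated values; the names `HELIX` and `estimated_pitch` are illustrative only, and because the base-pairs per turn are approximate, only rough agreement with the tabulated pitch should be expected.

```python
# Values copied from the table above (lengths in angstroms).
HELIX = {
    "A-DNA": {"rise": 2.3, "bp_per_turn": 11, "pitch": 25.0},
    "B-DNA": {"rise": 3.4, "bp_per_turn": 10, "pitch": 34.0},
    "Z-DNA": {"rise": 3.8, "bp_per_turn": 12, "pitch": 47.0},
}


def estimated_pitch(rise: float, bp_per_turn: float) -> float:
    """Pitch of one helical turn = rise per base-pair x base-pairs per turn."""
    return rise * bp_per_turn


if __name__ == "__main__":
    for name, p in HELIX.items():
        est = estimated_pitch(p["rise"], p["bp_per_turn"])
        print(f"{name}: estimated {est:.1f} A, tabulated {p['pitch']:.1f} A")
```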
The ball-and-stick representation shown above can be misleading because it suggests that
there is empty space between the two strands and between the base-pair stacks. Another
representation is the space-filling representation, in which each atom is shown as a
ball whose radius represents its van der Waals radius. The picture below shows this view
for the 3 DNA structures shown above.
Here, the B-DNA is on the left and the A-DNA is in the middle. The blue and white atoms
are the sugar-phosphate backbone atoms, the red are G-C base-pairs and the yellow are A-T
base-pairs. The B-DNA picture shows very clearly the 'grooves' in between the backbones
that also spiral around the DNA structure; the grooves in B-DNA come in two sizes, the
minor groove and the major groove.
A DNA molecule is not a rigid, static structure as x-ray diffraction pictures might suggest,
and the crystallographic parameters shown above are average parameters. In reality, each of
these structures is subject to constant thermal fluctuations, which result in local twisting,
stretching, bending, and unwinding of the double strands. Also, certain sequences lead to
permanent bends or kinks in the direction of the helix. These local (sequence-specific)
fluctuations are essential for the recognition of specific binding sites along the DNA
molecule where proteins involved in replication, transcription, regulation of gene expression,
or DNA-damage repair can bind.
RNA structures
RNA molecules are also polynucleotides with a sugar-phosphate backbone and four kinds of
bases. The main differences between RNA and DNA are:



RNA molecules are generally single-stranded.

The sugar in RNA is a ribose sugar (as opposed to deoxy-ribose) and has an -OH group at
the 2' C position, highlighted in red in the figure below (DNA sugars have an -H at that
position).

Thymine in DNA is replaced by Uracil in RNA. T has a methyl (-CH3) group in place
of the H atom shown in red in U.

The picture shows an ATP molecule (adenosine tri-phosphate) about to be incorporated into
an RNA chain with the release of a di-phosphate.
RNA molecules do not have a regular helical structure like DNA. Instead, they can form
complicated 3-dimensional structures where the strands can loop back and form intra-strand
base-pairs from self-complementary regions along the chain.
[Figure panels: DNA structure; RNA structure]
There are three classes of RNA molecules:

messenger RNA (mRNA), which acts as a template for protein synthesis and has the
same sequence of bases (read from the 5' to the 3' end) as the DNA strand that carries the
gene sequence, with U in place of T. An mRNA can range from ~300 nucleotides to ~7000
nucleotides, depending on the size and the number of proteins that it codes for.

transfer RNA (tRNA), one for each triplet codon that codes for a specific amino-acid
(the building blocks of proteins). tRNA molecules are covalently attached to the
corresponding amino-acid at one end, and at the other end they have a triplet sequence
(called the anti-codon) that is complementary to the triplet codon on the mRNA. All
tRNA molecules are ~70-90 nucleotides long. They have a molecular weight
of ~25,000 and a sedimentation coefficient of ~ 4 Svedberg (S) units.

ribosomal RNA (rRNA), which makes up an integral part of the ribosome, the protein
synthesis machinery in the cell.
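Two statements in the list above lend themselves to a short sketch: the mRNA carries the same base sequence as the gene-carrying (coding) DNA strand, with U in place of T, and a tRNA of ~70-90 nucleotides has a molecular weight of ~25,000. The function names are illustrative, and the average mass per nucleotide (330 Da) is an assumed round number used only for an order-of-magnitude check, not a value taken from these notes.

```python
# Assumed rough average mass of one nucleotide in a chain, in daltons.
# This round number is an assumption for illustration only.
AVG_NUCLEOTIDE_MASS_DA = 330.0


def transcribe_coding_strand(coding_strand: str) -> str:
    """Return the mRNA (5' to 3') for a DNA coding strand (5' to 3').

    The mRNA has the same base sequence as the coding strand, with
    uracil (U) in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


def estimated_mass_da(n_nucleotides: int) -> float:
    """Rough molecular weight of a chain of n nucleotides."""
    return n_nucleotides * AVG_NUCLEOTIDE_MASS_DA


if __name__ == "__main__":
    print(transcribe_coding_strand("ATGGCC"))  # hypothetical gene fragment
    print(estimated_mass_da(76))               # a tRNA-sized chain
```

With the assumed 330 Da per nucleotide, a chain of 76 nucleotides comes out close to the ~25,000 quoted for tRNA.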
Secondary and tertiary structures of tRNA molecules
The crystal structures of several tRNA molecules have been determined. All tRNA molecules
have very similar secondary structures in which the single-stranded chain is folded in a
'clover-leaf' structure that has three hairpins and an acceptor stem where the amino-acid is
covalently attached. The acceptor stem is the 3' end of the chain and always terminates in the
sequence 5'-CCA-3'.
This particular tRNA is specific for the amino-acid Alanine, whose codon on the mRNA is
5'-GCC-3'; the anti-codon loop of the tRNA reads 5'-GGC-3'. The grey circles are examples of
unusual, chemically modified, bases.
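The relation between the codon and the anti-codon quoted above (5'-GCC-3' and 5'-GGC-3') is a reverse complement, because the two pair in an antiparallel orientation. A minimal sketch, assuming strict Watson-Crick pairing only (it ignores modified bases and any non-standard pairing), with an illustrative function name:

```python
# Watson-Crick partners for RNA bases (A pairs with U, G pairs with C).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anti-codon (5' to 3') that pairs with an mRNA codon (5' to 3').

    The codon and anti-codon pair antiparallel, so the anti-codon is the
    complement of the codon read in the reverse direction.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


if __name__ == "__main__":
    print(anticodon_for("GCC"))  # the alanine codon used in the text
```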
The secondary structure then folds up to form a 3-dimensional structure which looks like an
inverted L.
One end of one L arm (the 3' end of the chain) is the acceptor stem. The other end of the L is
the anti-codon loop that has to match the codon on the mRNA. The distance between the two
ends of the L is ~ 7 nm. The corner of the L is used for correct positioning on the ribosome
where the protein synthesis takes place.
In the tertiary (3-dimensional) structures of RNA, bases sometimes make hydrogen bonds
with more than one partner, as illustrated in the picture above. These extra hydrogen bonds
compensate for the distortion in the double-stranded helical regions when the RNA folds up
and help stabilize the tertiary structure.
The covalent attachment between the tRNA and its corresponding amino-acid is achieved by
yet another adaptor molecule (this time a protein molecule called aminoacyl-tRNA
synthetase) of which there are at least 20 varieties, one for each kind of amino-acid. The
synthetases recognize the detailed shape and properties of a specific amino-acid and the
detailed shape of the acceptor stem in the folded tRNA molecule and catalyze the covalent
attachment between the amino-acid and its corresponding tRNA.
Ribosomal RNA
The ribosome is a large molecular machine (~ 20 nm in diameter, with a sedimentation
coefficient of 70S for bacterial ribosomes) made of two subunits: a large subunit (~ 50S) and
a small subunit (~ 30S). The large subunit is in turn made of two ribosomal RNAs (5S and 23S)
and ~ 34 proteins, whereas the small subunit has one ribosomal RNA (16S) and ~ 21 proteins.
The 23S rRNA is ~ 3000 nucleotides long, and the 16S rRNA is ~ 1500 nucleotides long.
The structures of ribosomal RNA can get very complicated because of the large number of
ways in which hairpins and loops can be formed. Predicting these structures requires a
combination of both computational methods (in which the most probable secondary
structures are determined from estimates of free energy for a given structure) and a variety of
experimental techniques.
Oligonucleotide mapping techniques
This technique is useful in identifying exposed single-stranded regions of a folded RNA
molecule by hybridization with short synthesized nucleotide chains (also called
oligonucleotides) that are complementary to, for instance, the loop regions in RNA.
Folded RNA molecules are confined to one region in space, separated from another region by a
semi-permeable membrane. On the other side of the partition are radioactive oligonucleotides
(~ 5-10 nucleotides long) that can pass through the membrane and bind to RNA molecules,
whereas the RNA molecules, which are much bigger in size, cannot pass through.
At equilibrium, free oligomers are at the same concentration on both sides of the partition.
However, the radioactivity on the side with the RNA molecules is larger than on the other side
because some oligomers will associate with (bind to) RNA if the sequences of the oligomers and
the loop regions are complementary. The ratio ($r_d$) of the radioactivity on the two sides
gives a measure of the binding or association constant

$$K_a = \frac{[X]}{[\mathrm{RNA}]\,[O]}$$

where [X] is the concentration of the RNA-oligomer complex, [O] is the free oligomer
concentration on either side, and [RNA] is the concentration of RNA molecules that are not
bound to an oligomer. The ratio is

$$r_d = \frac{[O] + [X]}{[O]} = 1 + K_a\,[\mathrm{RNA}]$$

If [RNA] >> [O], then the RNA concentration can be assumed the same before and after
mixing and the ratio becomes

$$r_d \approx 1 + K_a\,[\mathrm{RNA}]_0$$

where $[\mathrm{RNA}]_0$ is the total RNA concentration.
Therefore, a measurement of $r_d$ yields a direct measure of $K_a$.
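A minimal sketch of this last step, assuming the relation r_d = 1 + K_a [RNA] (which follows from r_d = ([O] + [X]) / [O] together with K_a = [X] / ([RNA][O])) and assuming [RNA] >> [O] so that the total RNA concentration can be used. The function name and the numbers in the usage example are hypothetical.

```python
def association_constant(r_d: float, rna_conc_molar: float) -> float:
    """Estimate K_a (in 1/M) from the radioactivity ratio r_d.

    Uses r_d = 1 + K_a * [RNA], valid when [RNA] >> [O] so that the RNA
    concentration is effectively unchanged by binding.
    """
    if rna_conc_molar <= 0:
        raise ValueError("RNA concentration must be positive")
    if r_d < 1:
        raise ValueError("r_d cannot be below 1 in this model")
    return (r_d - 1.0) / rna_conc_molar


if __name__ == "__main__":
    # Hypothetical measurement: three times more radioactivity on the
    # RNA side, with 10 micromolar RNA.
    print(association_constant(r_d=3.0, rna_conc_molar=1e-5))
```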
All oligonucleotides will lead to some association since there is always a match at the single
base-pair level. Therefore $r_d > 1$ for any oligonucleotide. For oligonucleotides ~ 4 bases
long that match an exposed loop region on the RNA, the free energy change upon association
is substantially larger (by ~ 10-15 $k_B T$) than the free energy change from single base-pair
matches. This leads to an increase in the association constant by a factor of $10^4$ to $10^6$.
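The quoted factor can be checked directly: since the association constant scales as the exponential of the binding free energy in units of kBT, an extra 10-15 kBT multiplies it by e^10 to e^15. A small sketch (the function name is illustrative):

```python
import math


def association_enhancement(extra_binding_kbt: float) -> float:
    """Factor by which K_a grows for an extra binding free energy.

    The argument is the additional free energy gained on association,
    expressed in units of kB*T.
    """
    return math.exp(extra_binding_kbt)


if __name__ == "__main__":
    for extra in (10.0, 15.0):
        print(f"{extra:.0f} kBT -> factor {association_enhancement(extra):.2e}")
```

Both ends of the range fall within the stated interval: e^10 is a little above 10^4 and e^15 is a few times 10^6.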
This technique can easily distinguish between two possible conformations of an RNA
molecule which have different sequences in their loop regions.
We can also estimate which structure is more probable (i.e. which one has the lower free
energy).
The free energy of a hairpin can be broken into two parts: the free energy of forming a loop
closed by a single base-pair, $\Delta G_{loop}$, and the free energy for the base-paired
'stem' of the hairpin, $\Delta G_{stem}$.
In RNA molecules the most probable loop size consists of ~ 6-7 bases in the loop. Smaller
loops are energetically unfavorable as a result of steric hindrances among the bases and atoms
of the backbone. Larger loops are entropically unfavorable. The loss of entropy when loops
are formed increases with increasing loop size.
The loop free energy $\Delta G_{loop}$ for the optimal sized loop closed by a G-C base-pair is
~ 7-8 $k_B T$ in 1M NaCl. In our example we have a loop with 10 bases in structure 1
($\Delta G_{loop}$ ~ …) and 2 loops with 4 bases each in structure 2 ($\Delta G_{loop}$ ~ …
for each loop).
Note that $\Delta G_{loop}$ is a positive quantity; it is unfavorable to make loops relative
to the random coil conformation.
The hairpin structures are stabilized when the free energy gain from base-pair
formation exceeds the free energy cost of loop formation.
The gain in free energy from adding a base-pair to an already existing G-C pair is ~ … for
adding a G-C base-pair and ~ … for adding an A-U base-pair.
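Therefore the net change in free energy for structure 1 is $\Delta G_1$ = … and for
structure 2 is $\Delta G_2$ = …
Structure 1 is more stable (although marginally) and the relative populations of the two
structures are given by the Boltzmann distribution

$$\frac{P_1}{P_2} = \exp\left(-\frac{\Delta G_1 - \Delta G_2}{k_B T}\right)$$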
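The bookkeeping described above (loop costs plus base-pairing gains, followed by a Boltzmann ratio) can be sketched as follows. All energies are in units of kBT. The numerical values in the usage example are placeholders chosen only to exercise the functions; they are not the values for structures 1 and 2 in these notes, and the function names are illustrative.

```python
import math
from typing import Sequence


def hairpin_free_energy(loop_costs_kbt: Sequence[float],
                        pairing_gains_kbt: Sequence[float]) -> float:
    """Net free energy of a folded structure relative to the random coil.

    loop_costs_kbt: one positive entry per loop (unfavorable).
    pairing_gains_kbt: one negative entry per added base-pair (favorable).
    """
    return sum(loop_costs_kbt) + sum(pairing_gains_kbt)


def population_ratio(dg1_kbt: float, dg2_kbt: float) -> float:
    """Boltzmann ratio P1/P2 = exp(-(dG1 - dG2)) with energies in kBT."""
    return math.exp(-(dg1_kbt - dg2_kbt))


if __name__ == "__main__":
    # Placeholder numbers only (hypothetical, in kBT).
    dg1 = hairpin_free_energy([9.0], [-5.0, -5.0, -5.0])
    dg2 = hairpin_free_energy([8.0, 8.0], [-5.0, -5.0, -5.0, -6.0])
    print(dg1, dg2, population_ratio(dg1, dg2))
```

A structure is favored over the random coil only when its net free energy is negative, and a difference of about 1 kBT between two structures corresponds to a population ratio of roughly e, which is why a marginal difference in stability still leaves both structures well populated.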