Molecular Biology: General theory
Molecular Biology:
General Theory
Author: Dr Darshana Morar
Licensed under a Creative Commons Attribution license.
TABLE OF CONTENTS
INTRODUCTION
NUCLEIC ACIDS
DNA
RNA Structure
GENOME ORGANIZATION
Viruses
Prokaryotes
Eukaryotes
THE CENTRAL DOGMA OF MOLECULAR BIOLOGY
DNA REPLICATION
The replication process in bacteria
The replication process in viruses
The replication process in Eukaryotes
TRANSCRIPTION
Prokaryotes
Eukaryotes
TRANSLATION
REFERENCES
The relevant websites as indicated in the course outline
A few general references
INTRODUCTION
Before the reader can gain an understanding of the practical applications of nucleic acids and proteins,
the basic structure, mode of action and functions of these complex structures should be known.
Two types of nucleic acids can be found in a eukaryotic cell: deoxyribonucleic acid (DNA) and ribonucleic
acid (RNA).
Deoxyribonucleic acid contains the genetic instructions needed to construct components of cells, such as
RNA and protein molecules. Therefore, DNA is used in the development and functioning of all known
living organisms and some viruses, with the primary role being long-term storage of information. Genes
are the DNA segments that carry this genetic information.
RNA differs in structure from DNA, but is transcribed (made) from DNA by enzymes called RNA
polymerases and is generally processed further by other enzymes. RNA is central to protein synthesis:
messenger RNAs are translated into proteins with the help of other RNAs.
Proteins are large organic compounds made of amino acids arranged in a linear chain, joined together by
peptide bonds between the carboxyl and amino groups of adjacent amino acid residues. Like other
biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of
organisms and participate in every process within cells. Many proteins are enzymes that catalyze
biochemical reactions and are vital to metabolism.
NUCLEIC ACIDS
DNA
DNA is made up of nucleotides. A nucleotide consists of a nitrogenous base (either a purine or a
pyrimidine), a five-carbon sugar (deoxyribose), and a phosphate group. The four nitrogenous bases found
in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T).
Figure 1 Basic structure of a nucleotide
(Adapted from http://www.duke.edu/web/MAT/jennifer_sohn/unit/translation_oh.htm)
DNA is composed of a series of nucleotides, held together in a long linear chain by a phosphodiester
linkage from the 3’ hydroxyl group of one sugar to the 5’ phosphate group of the next (a DNA strand or
polynucleotide therefore comprises a sugar-phosphate backbone with attached bases). Since the
phosphodiester bonds form between the third and fifth carbon atoms of adjacent sugars, a strand of DNA
has a direction or orientation.
Most DNA is double-stranded: two chains that constitute a double helix. The two chains have opposite
orientations and are connected via hydrogen bonds between opposing bases. Each type of base
on one strand forms a bond with just one type of base on the other strand. Adenine (purine) bonds only to
thymine (pyrimidine) by two hydrogen bonds, and guanine (purine) bonds only to cytosine (pyrimidine) by
three hydrogen bonds. This arrangement of two nucleotides binding together across the double helix is
called a base pair. The sequences of the two DNA chains are thus complementary.
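The pairing rules above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, and a strand is represented simply as a string written 5' to 3'.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    # The two strands run in opposite orientations, so the paired bases
    # are read in reverse order to keep the 5' to 3' convention.
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count hydrogen bonds in the duplex: 2 per A-T pair, 3 per G-C pair."""
    return sum(2 if base in "AT" else 3 for base in strand.upper())


upper = "ATGGCA"  # hypothetical example sequence, written 5' to 3'
print(complement_strand(upper))  # TGCCAT
print(hydrogen_bonds(upper))     # 15
```

Applying `complement_strand` twice returns the original sequence, which is one way to see that the two chains carry the same information.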
Figure 2 Basic structure of DNA
By convention the upper strand is written from 5' (five prime) to 3' (three prime). If DNA is transcribed, the
upper strand (known as the sense or coding strand) corresponds to the RNA sequence. If this RNA is
messenger RNA (mRNA), it contains the triplets of the genetic code. Normally only this upper strand is
listed in publications and databases.
In schematic drawings of gene structure the DNA is often represented as a straight line (or a circle for a
plasmid). Left to right always corresponds to the 5' to 3' orientation of the upper strand.
5' _______________________ 3'
This representation does not do justice to the double-helix structure, but is nevertheless adequate to
understand the DNA manipulations described in the other modules.
Cleavage of DNA typically generates one end with a phosphate group coupled to the fifth carbon atom in
the deoxyribose sugar ring (the 5'-phosphate) and the other end with a hydroxyl group attached to the third
carbon atom in the deoxyribose sugar ring (the free 3'-OH group):
5'-phosphate________________________3'-OH
RNA Structure
RNA is a very labile molecule and, like DNA, it is made from a long chain of nucleotide units. Each
nucleotide consists of a nitrogenous base, a ribose sugar, and a phosphate. Note that the sugar
present in a DNA molecule is deoxyribose, while the sugar present in a molecule of RNA is ribose (in
deoxyribose there is no hydroxyl group attached to the second carbon atom in the sugar ring). The
extra hydroxyl group makes RNA more prone to hydrolysis, which is why it is less stable than DNA. The
four bases found in RNA are adenine (A), cytosine (C), guanine (G) and uracil (U) (replacing the thymine
(T) in DNA).
Figure 3 Deoxyribose versus ribose sugars
Although RNA is essentially single-stranded, most biologically active RNA molecules contain self-complementary sequences that allow parts of the molecule to fold and pair with itself to form double
helices. There are three major types of RNA: messenger RNA (mRNA), transfer RNA (tRNA) and
ribosomal RNA (rRNA). There are a number of other types of RNA present in smaller quantities as well,
including small nuclear RNA (snRNA), small nucleolar RNA (snoRNA) and the 4.5S signal recognition
particle (SRP) RNA. Novel species of RNA continue to be identified.
GENOME ORGANIZATION
All of the genetic information or hereditary material possessed by an organism is known as its genome.
Viruses, prokaryotes and eukaryotes contain nucleic acids, with the relative amount of DNA and/or RNA
differing depending on the organism. Viruses have the smallest genomes. Prokaryotes have larger
genomes than viruses, while eukaryotes have the largest genomes of all.
Viruses
Viruses contain either DNA or RNA, never both, have no cytoplasm or cell organelles and are obligate
intracellular parasites.
Viral genomes
The nucleic acid may be single-stranded or double-stranded; it may have a linear, circular or
segmented configuration. Single-stranded virus genomes may be:
• Positive (+) sense, i.e. of the same polarity (nucleotide sequence) as mRNA
• Negative (-) sense, or
• Ambisense - a mixture of the two.
Prokaryotes
There are different groups of prokaryotes, distinguished from one another by characteristic genetic and
biochemical features:
• The bacteria, which include most of the commonly encountered prokaryotes such as the Gram-negatives (e.g. Escherichia coli), the Gram-positives (e.g. Bacillus subtilis), the cyanobacteria (e.g.
Anabaena) and many more.
• The archaea, which are less well-studied, and have mostly been found in extreme environments
such as hot springs, brine pools and anaerobic lake bottoms.
Prokaryotic genomes
The traditional view has been that an entire prokaryotic genome is contained in a single circular
DNA molecule, localized within the nucleoid - the lightly staining region of the otherwise featureless
prokaryotic cell. As well as this single ‘chromosome’, prokaryotes may also have additional genes
on independent smaller, circular or linear DNA molecules called plasmids. However, this traditional
view of the prokaryotic genome has been biased by the extensive research on E. coli, which has
been accompanied by the mistaken assumption that E. coli is a typical prokaryote. In fact,
prokaryotes display a considerable diversity in genome organization, some having a unipartite
genome, like E. coli, but others being more complex. Borrelia burgdorferi B31, for example, has a
linear chromosome of 911 kb, accompanied by 17 or 18 linear and circular molecules, which
together contribute another 533 kb. Multipartite genomes are now known in many other bacteria
and archaea.
Eukaryotes
Animals, plants, fungi, and protists are eukaryotes, organisms whose cells are organized into complex
structures enclosed within membranes. The defining membrane-bound structure which differentiates
eukaryotic cells from prokaryotic cells is the nucleus.
Eukaryotic genomes
All of the eukaryotic nuclear genomes that have been studied are divided into two or more linear
DNA molecules, each contained in a different chromosome. All eukaryotes also possess smaller,
usually circular, mitochondrial genomes. Plants and other photosynthetic organisms possess a
third genome, located in the chloroplasts or plastids.
There is a large amount of DNA which does not code for proteins and which varies widely in
amount between species. In humans, for example, only about 1-2% of the genome codes for proteins
(exons); the remainder includes introns, regulatory sequences and repetitive DNA, and much of it
has no obvious function. The differences in genome sizes within a class of organisms are almost entirely due to
differences in the amount of this non-protein-coding DNA. Much of the non-protein-coding DNA is
highly repetitive. The repetitiveness of DNA can be used as a way of classifying the different types
of DNA which make up the eukaryotic genome.
Figure 4 Basic structures of eukaryotic and prokaryotic cells
THE CENTRAL DOGMA OF MOLECULAR BIOLOGY
The central dogma of molecular biology states that the flow of genetic information is “DNA to RNA to
PROTEIN”.
There are four major stages represented by this dogma.
1. During replication, the DNA replicates its information with the use of several enzymes.
2. Transcription is the synthesis of a single-stranded RNA from a double-stranded DNA template.
RNA synthesis occurs in a 5’→ 3’ direction and its sequence corresponds to that of the DNA
strand, which is known as the sense strand.
3. In eukaryotic cells, the messenger RNA (mRNA) synthesised during transcription is processed
by splicing and migrates to the cytoplasm.
4. During translation, the processed mRNA carries coded information to ribosomes in the
cytoplasm. The ribosomes then read the information on the mRNA template and link amino acids
in the prescribed order.
For the purposes of this course, the section on proteins will be excluded; however, a short
description of translation will be given before the conclusion of this chapter.
Figure 5 In vivo transcription of DNA to RNA and the translation of RNA to protein
(Adapted from http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookPROTSYn.html )
DNA REPLICATION
DNA replication is the process of duplicating the DNA sequence in the parent strand to produce an exact
replica (daughter strand). Replication is semi-conservative: each one of the two parental strands serves
as a template for the new strand synthesis; therefore, duplicated double helices contain one parental
strand and one daughter strand. DNA polymerases are the enzymes responsible for DNA synthesis.
These enzymes use a single-stranded DNA template to catalyze the polymerization of a complementary
DNA strand.
In a cell, DNA replication must happen before cell division can occur. DNA synthesis begins at specific
locations in the genome, called "origins", where the two strands of DNA are separated. Short RNA primers
are synthesized on the single-stranded DNA and the enzyme DNA polymerase extends the primers to form new
strands of DNA, adding nucleotides matched to the template strand. The unwinding of DNA and synthesis
of new strands forms a replication fork. In addition to DNA polymerase, a number of other proteins are
associated with the fork and assist in the initiation and continuation of DNA synthesis.
DNA replication can also be performed artificially, using the same enzymes used within the cell. DNA
polymerases and artificial DNA primers are used to initiate DNA synthesis at known sequences in a
template molecule. The polymerase chain reaction (PCR), a common laboratory technique, employs
artificial synthesis in a cyclic manner to rapidly and specifically amplify a target DNA fragment from a pool
of DNA.
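The cyclic amplification in PCR can be sketched as simple arithmetic. The following Python snippet is an idealized model, not a description of any particular protocol: the function name and the `efficiency` parameter (the fraction of templates copied in each cycle) are assumptions introduced here for illustration.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling in every cycle; lower values
    model cycles in which only a fraction of the templates is copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule, 30 cycles, perfect doubling: 2**30 copies.
print(pcr_copies(1, 30))
# The same run at an assumed 90% efficiency yields noticeably fewer copies.
print(pcr_copies(1, 30, efficiency=0.9))
```

Under perfect doubling, 30 cycles turn a single template into roughly a billion copies, which is why PCR can amplify a target from a very small amount of starting DNA.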
The replication process in bacteria
The chromosome of most bacteria, including E. coli, is a circular molecule of DNA. Replication begins at one origin of
replication and proceeds in both directions around the chromosome. The basic steps of DNA replication
are:
• Initiation – replication begins at an origin of replication
• Elongation – new strands of DNA are synthesized by DNA polymerase
• Termination – replication is terminated differently in prokaryotes and eukaryotes
Initiation
This step involves the assembly of a replication fork (bubble) at an origin of replication sequence of
DNA found at a specific site of the circular chromosome of a bacterium. This origin of replication is
unwound, and the partially unwound strands form a "replication bubble", with one replication fork on
either end. Each group of enzymes at the replication fork moves away from the origin, unwinding
and replicating the original DNA strands as they proceed. The factors involved are collectively
called the pre-replication complex. It consists of the following:
• A helicase, which unwinds and splits the DNA ahead of the fork. Thereafter, single-strand
binding proteins (SSB) swiftly bind to the separated DNA, thus preventing the strands from
reuniting. Because DNA is helical, this unwinding means that the DNA needs to rotate to avoid
too much supercoiling. An enzyme, named DNA topoisomerase I, precedes the replication
complex, and cleaves one strand of DNA ahead of the replication machinery, allowing free
rotation of the DNA between the nick and the replication complex.
• A primase (an RNA polymerase), which generates an RNA primer that serves as a starting
point or primer for synthesis of the new DNA chain.
• A DNA polymerase III holoenzyme, which in reality is a complex of enzymes that together
perform the actual replication.
Elongation
After the helicase unwinds the DNA, single-strand binding protein is used to hold the DNA strands
apart. RNA primase is then bound to the starting DNA site, and it synthesizes an RNA primer
complementary to the DNA.
At the beginning of replication, an enzyme called DNA polymerase III holoenzyme binds to the
RNA primer, which indicates the starting point for the replication.
• DNA polymerase can only synthesize new DNA in the 5’ to 3’ direction (of the new DNA, i.e. it
moves along the template from 3’ to 5’). Because of this, the DNA polymerase can only travel along
one of the original DNA strands without any interruption. This original strand, which runs from
3’ to 5’, is called the leading strand (Fig. 17). The complement of the leading strand, from 5’
to 3’, is the lagging strand.
• The holoenzyme catalyses the formation of a phosphodiester bond between the 3’-OH group
of the last sugar on a chain of DNA (the primer), and the 5’ phosphate group on an incoming
nucleoside triphosphate, chosen for its complementarity to the facing nucleotide present on the
template strand (the one being copied).
Figure 6 The replication fork
Replication of the lagging strand is more complicated than that of the leading strand. The
orientation of the lagging strand is opposite to the working orientation of DNA polymerase III
(which can only synthesize new DNA in the 5’ to 3’ direction). As a result, the DNA of the lagging strand
is replicated in a piecemeal fashion. The primase, which accompanies the holoenzyme,
synthesizes RNA primers along the lagging strand every few hundred base pairs, which are
then used as primers for DNA polymerase III action. Small stretches of DNA (so-called Okazaki
fragments, ~2 kb) are synthesized until DNA polymerase III reaches the next RNA primer, which halts
its activity. Then DNA polymerase I enters into action, digesting the RNA primer, and replacing
ribonucleotides with deoxyribonucleotides. Finally DNA ligase generates the last phosphodiester
bond between two newly synthesized Okazaki fragments.
Figure 7 DNA replication in bacteria
(Taken from: http://commons.wikimedia.org/wiki/Image:DNA_replication_editable.svg )
DNA Helicase is an enzyme that unravels the DNA double helix and breaks the hydrogen
bonds.
DNA primase is an enzyme that generates an RNA sequence that serves as a starting point or
primer for synthesis of the new DNA chain.
DNA polymerases are enzymes that synthesize a complementary DNA strand using the original
strand as a template. In DNA synthesis, the new strand grows 5' to 3'.
An Okazaki fragment is a stretch of non-parental DNA produced along the lagging strand of
parental DNA by the DNA polymerase, beginning at a primer.
Three DNA polymerases are found in bacteria
1) DNA polymerase I (Pol I) functions mainly in DNA repair, with a smaller role in DNA replication.
DNA polymerase I possesses three enzymatic activities:
• 5' to 3' (forward) DNA polymerase activity, requiring a 3' primer site and a template
strand. DNA polymerase I catalyzes the addition of nucleotides to the 3’ hydroxyl of
primer DNA and requires dNTPs and Mg2+. Synthesis is always in the 5’ to 3’ direction.
• A 5’ to 3’ (forward) exonuclease activity that mediates “nick translation” during DNA
repair and removal of RNA primers during replication.
• A 3’ to 5’ (reverse) exonuclease activity that allows “proofreading”.
2) DNA polymerase II (Pol II) is a DNA repair enzyme involved in replication of damaged DNA.
It has 3' to 5' exonuclease activity that mediates proofreading. DNA polymerase II differs from
DNA polymerase I in that it lacks 5' to 3' exonuclease activity and cannot use a nicked duplex
template.
3) DNA polymerase III (Pol III) is the main enzyme used in DNA replication; it is a complex of
several proteins (at least 20), forming the so-called holoenzyme. DNA polymerase III has 5' to 3'
(forward) DNA polymerization activity and catalyzes the addition of dNTPs to the end of a new
DNA strand with release of inorganic pyrophosphate (PPi). It also has 3' to 5' exonuclease
proofreading capability. The holoenzyme has high processivity, i.e. it adds a large number of nucleotides
per binding event. The rate of DNA synthesis is extremely fast: 30,000-60,000 nucleotides per
minute.
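To put the synthesis rate in context, the time needed to copy a whole bacterial chromosome can be estimated with a few lines of Python. This is a rough sketch under stated assumptions: the quoted rate is read as 30,000 to 60,000 nucleotides per minute per fork, replication is bidirectional from a single origin (two forks), and the chromosome size of about 4.6 million base pairs for E. coli is an assumed figure that does not come from these notes.

```python
# Assumed approximate size of the E. coli chromosome (not taken from the notes).
GENOME_BP = 4_600_000
# Bidirectional replication from a single origin gives two replication forks.
FORKS = 2


def replication_minutes(genome_bp: int, rate_nt_per_min: int, forks: int = FORKS) -> float:
    """Minutes needed to copy the genome when each fork copies at the given rate."""
    return genome_bp / (rate_nt_per_min * forks)


for rate in (30_000, 60_000):
    minutes = replication_minutes(GENOME_BP, rate)
    print(f"{rate} nt/min per fork: about {minutes:.1f} min")
```

With these assumed numbers the estimate falls between roughly 38 and 77 minutes, so the figures are best read as an order-of-magnitude check rather than a measured value.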
Termination
Because bacteria have circular chromosomes, termination of replication occurs when the two
replication forks meet each other on the opposite end of the parental chromosome. Termination of
DNA replication in E. coli is regulated through the use of termination sequences (Ter sites) and the
Tus protein (terminator utilization substance protein). The Tus protein binds to the termination
sites, and these Tus-Ter complexes allow a replication fork to pass through in one direction, but not the
other. Multiple Ter sites are located in the termination region of the E. coli chromosome and slow
down and stop the movement of the replication forks in this region. As a result, the replication forks
are constrained to always meet and terminate within the termination region of the chromosome.
Fidelity of DNA replication
The error rate in DNA replication is very low and has been estimated at 1 error in 10 million bases.
DNA polymerase III proofreads the newly made strand before continuing with replication. When an
incorrect nucleotide is incorporated, the 3’ end will be frayed. The DNA polymerase III recognizes
the problem, "backs up" and removes the incorrect nucleotide by means of its 3' to 5' exonuclease
activity. The correct nucleotide is then added to the chain and elongation is resumed.
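The quoted error rate can be turned into an expected number of errors per round of replication. The Python sketch below is illustrative only: the function name is hypothetical, and the genome size of about 4.6 million base pairs for E. coli is an assumption that is not given in these notes.

```python
# Error rate quoted in the text: 1 error in 10 million bases.
ERROR_RATE = 1 / 10_000_000


def expected_errors(bases_copied: int, error_rate: float = ERROR_RATE) -> float:
    """Expected number of misincorporated bases when copying the given number of bases."""
    return bases_copied * error_rate


# Assumed E. coli chromosome size of about 4.6 million base pairs.
print(round(expected_errors(4_600_000), 2))
```

Under these assumptions, fewer than one error is expected each time the whole chromosome is copied.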
Note: DNA polymerase III present in the replication fork has three important properties:
(i) Chain elongation (ii) Processivity and (iii) Proofreading
Catalytic activities of DNA polymerases:
Important note: DNA polymerases I, II and III carry both a 5’ to 3’ polymerase activity, and a 3’ to 5’
exonuclease activity which eliminates from an elongating chain of DNA any misincorporated
(non-complementary) nucleotides. This is known as proofreading activity.
• 5' to 3' polymerase activity for DNA synthesis (DNA Pol I, II and III)
• 5' to 3' exonuclease activity for removal of the DNA/RNA primer (allows DNA Pol I to destroy a
strand of DNA or RNA located ahead of it (3’ to it) on the DNA template)
• 3' to 5' exonuclease activity for proofreading (DNA Pol I, II and III)
The replication process in viruses
Viral populations do not grow through cell division, because they are acellular; instead, they use the
machinery and metabolism of a host cell to produce multiple copies of themselves. In many ways the
replication is very similar to that of bacteria but the expression of virus genetic information is dependent
on the structure of the genome of the particular virus concerned. In every case, the genome must be
recognized and expressed using the mechanisms of the host cell.
The replication process in Eukaryotes
The DNA replication process in eukaryotes is essentially the same as in prokaryotes, with some
differences. While replication begins at a single origin in prokaryotes (oriC in E. coli), there are multiple origins of replication in
eukaryotes due to the sheer size of the chromosomes. Replication begins at specific places on the
chromosome - the origin or "ori" region and proceeds bidirectionally.
As in prokaryotes, the two parent strands are unwound with the help of DNA helicases. Stabilizing
proteins attach to the unwound strands, preventing them from winding back together. DNA polymerases
α and δ are responsible for DNA synthesis in eukaryotes. DNA polymerase α begins with an RNA primer
and then adds a few DNA nucleotides. DNA polymerase δ takes over and synthesizes DNA nucleotides
at approximately 100 times the rate of DNA polymerase α. RNA primers, needed repeatedly on the
lagging strand to facilitate synthesis of Okazaki fragments, are synthesized by subunits of DNA
polymerase α. DNA Polymerases β and ξ are presumed to be involved in DNA repair in eukaryotes
which may indicate that they are involved in RNA primer removal. Finally, as in prokaryotes, each new
Okazaki fragment is attached to the completed portion of the lagging strand in a reaction catalyzed by
DNA ligase.
Eukaryotic nuclear chromosomes are packaged by histone proteins into a condensed structure called
chromatin. The replication of eukaryotic chromosomes presents additional problems, including the need
to remove, replicate and replace histones and other similar proteins associated with the DNA double
helix. In addition, eukaryotes must be able to deal with the linear ends of each chromosome arm (the
telomeres), i.e. once the RNA primer is removed at the end of each arm of a chromosome, there is no 3’-OH end for the DNA polymerase to recognize and use to replace missing nucleotides. This function is
accomplished by telomerase, an enzyme that adds specific DNA sequence repeats ("TTAGGG" in all
vertebrates) to the 3’ end of DNA strands in the telomere regions. This enzyme is a reverse transcriptase
that carries its own RNA molecule, which is used as a template when it elongates telomeres.
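The effect of telomerase on a chromosome end can be pictured with a very small Python sketch. It models only the outcome described above (repeats appended to the 3’ end); the function name, the example sequence and the number of repeats are hypothetical, and the RNA-templated mechanism itself is not modelled.

```python
# Vertebrate telomere repeat quoted in the text.
TELOMERE_REPEAT = "TTAGGG"


def extend_telomere(strand_5_to_3: str, repeats: int) -> str:
    """Append telomere repeats to the 3' end of a strand written 5' to 3'."""
    if repeats < 0:
        raise ValueError("repeats must not be negative")
    # The 3' end is the right-hand end of a string written 5' to 3'.
    return strand_5_to_3 + TELOMERE_REPEAT * repeats


chromosome_end = "ACGTAC"  # hypothetical sequence near a chromosome end
print(extend_telomere(chromosome_end, 3))  # ACGTACTTAGGGTTAGGGTTAGGG
```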
Figure 8 The replication of DNA in Eukaryotes
TRANSCRIPTION
Transcription is the synthesis of RNA under the direction of DNA, a process similar to DNA replication.
The DNA sequence is copied by RNA polymerase to produce a complementary RNA strand. When this
strand encodes a protein it is called messenger RNA
(mRNA), because it carries a genetic message from the DNA to the protein-synthesizing machinery of the
cell. Other types of transcribed RNA, such as transfer RNA, ribosomal RNA, and small nuclear RNA are
not necessarily translated into an amino acid sequence.
Unlike DNA replication, transcription does not need a primer to start. RNA polymerase simply binds to the
DNA and, along with other cofactors, unwinds the DNA to create an initiation bubble and the bases on the
two strands are exposed. But how does the RNA polymerase know where to begin? The starting point of
a gene is marked by a certain base sequence which is called a promoter site. In prokaryotes,
transcription begins with the binding of RNA polymerase to the promoter on the DNA molecule. At the
start of initiation, the RNA polymerase is associated with a sigma factor that aids in recognizing the
promoter sequences (the -35 and -10 elements) upstream of the transcription start site. Transcription
initiation is far more complex in eukaryotes, the main difference being that eukaryotic RNA polymerases
do not directly recognize their promoter sequences. In eukaryotes, a collection of proteins called
transcription factors mediate the binding of RNA polymerase and the initiation of transcription.
As in DNA replication, RNA is synthesized in the 5' to 3' direction (from the point of view of the growing
RNA transcript). Only one of the two DNA strands is transcribed into mRNA (remember that RNA is a
single-stranded molecule), unlike DNA replication, where both strands are copied. The DNA strand that is
transcribed is called the template strand (also known as the antisense strand), while its complement is
called the informational strand (also called the coding or sense strand).
Since the template strand and the informational strand are complementary, and since the template strand
and the mRNA molecule are also complementary, it follows that the messenger RNA molecule produced
during transcription is a copy of the DNA informational strand. Unlike DNA replication, mRNA transcription
can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription
resulting in amplification of a particular mRNA, i.e. many mRNA molecules can be produced from a single
copy of a gene. This step also involves a proofreading mechanism that can replace incorrectly
incorporated bases.
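The relationship between the template strand, the informational strand and the mRNA can be checked with a short Python sketch. The function names and the example sequences are hypothetical; strands are plain strings, with the template written 3' to 5' so that it lines up base by base with the mRNA written 5' to 3'.

```python
# RNA base incorporated opposite each DNA template base (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') from a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def informational_to_mrna(informational_5_to_3: str) -> str:
    """The mRNA matches the informational strand, with U in place of T."""
    return informational_5_to_3.upper().replace("T", "U")


informational = "ATGGCATAA"  # hypothetical informational strand, 5' to 3'
template = "TACCGTATT"       # its complement, written 3' to 5'

print(transcribe(template))  # AUGGCAUAA
# Both routes give the same mRNA, as the paragraph above argues.
print(transcribe(template) == informational_to_mrna(informational))  # True
```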
In bacteria, just as there is a sigma factor to help signal the beginning of a gene, another factor called
"rho" aids in terminating the process of transcription. When the end of the gene is near, the rho factor
binds to the mRNA, destabilizing the interaction between the template and the mRNA. This releases the
newly synthesized mRNA from the elongation complex, thus stopping transcription. An alternative
strategy for transcription termination in bacteria is known as rho-independent transcription termination.
RNA transcription stops when the newly synthesized RNA molecule forms a G-C rich hairpin loop,
followed by a run of Us, which makes it detach from the DNA template. Transcription termination in
eukaryotes is less well understood. It involves cleavage of the new transcript, followed by template-independent addition of As at its new 3' end, in a process called polyadenylation.
The stretch of DNA that is transcribed into an RNA molecule is called a transcription unit. A transcription
unit that is translated into protein contains sequences that direct and regulate protein synthesis in addition
to the coding sequence that is translated into protein. The regulatory sequence that lies before, or 5' to, the
coding sequence is called the 5' untranslated region (5’UTR), and the sequence that follows, or lies 3' to,
the coding sequence is called the 3' untranslated region (3’UTR). Transcription has a lower copying
fidelity than DNA replication since, although there are some proofreading mechanisms, they are fewer
and less effective than the controls for copying DNA.
Figure 9 Transcription of RNA from DNA
(Adapted from: http://biologysemester58.wikispaces.com/Molecular+Genetics )
Prokaryotes
Prokaryotic transcription occurs in the cytoplasm alongside translation. Prokaryotic genes generally lack
introns, so the RNA molecule produced by transcription corresponds directly to the DNA sequence.
In prokaryotes, mRNA is generally not modified.
Eukaryotes
Eukaryotic transcription occurs in the nucleus, where it is separated from the cytoplasm by the nuclear
membrane. The mRNA transcript is then transported into the cytoplasm where translation occurs.
Eukaryotic DNA is wound around histones to form nucleosomes and packaged as chromatin. Chromatin
has a strong influence on the accessibility of the DNA to transcription factors and the transcriptional
machinery including RNA polymerase.
In most mammalian cells, only about 1% of the DNA sequence is copied into a functional RNA (mRNA).
Eukaryotic mRNA is modified through RNA splicing, 5' end capping, and the addition of a polyA tail. One
of the most important stages in RNA processing is RNA splicing. In many genes, the DNA sequence
coding for proteins, or "exons", may be interrupted by stretches of non-coding DNA, called "introns". In
the cell nucleus, the DNA that includes all the exons and introns of the gene is first transcribed into a
complementary RNA copy called "nuclear RNA," or nRNA (also known as pre-mRNA). In a second step,
introns are removed from the nRNA by a process called RNA splicing. The spliced sequence is called
"messenger RNA," or mRNA. Only part of the DNA is transcribed to produce nuclear RNA, and only a
minor portion of the nuclear RNA survives the RNA processing steps.
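The splicing step can be pictured with a small Python sketch. It simply joins the exon segments of a
made-up nuclear RNA and discards the intron. The sequence, the exon coordinates and the function name
splice are hypothetical; in the cell, the splicing machinery recognizes sequence signals at the intron
boundaries rather than being handed coordinates.

```python
def splice(nuclear_rna, exons):
    """Join the exon segments of a transcript.

    exons is a list of (start, end) positions, 0-based with the end
    excluded. Everything outside these intervals is treated as intron.
    """
    return "".join(nuclear_rna[start:end] for start, end in sorted(exons))

# Hypothetical transcript: exon 1 + intron + exon 2.
# The intron here starts with GU and ends with AG, as most introns do.
nuclear_rna = "AUGGCC" + "GUAAGUUUUCAG" + "GAUUAA"
exons = [(0, 6), (18, 24)]

mrna = splice(nuclear_rna, exons)
print(mrna)  # AUGGCCGAUUAA
```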
The mRNA leaves the nucleus and travels to the cytoplasm, where it encounters cellular bodies called
ribosomes. The mRNA, which carries the gene's instructions, dictates the production of proteins by the
ribosomes in a process known as translation.
TRANSLATION
Translation is the process of converting the mRNA sequence into an amino acid sequence. It occurs in
the cytoplasm where the ribosomes are located. Ribosomes are made of a small and a large subunit
which surround the mRNA. In translation, an mRNA sequence is used by the ribosome as a template to
guide the synthesis of a chain of amino acids.
DNA transfers information to mRNA in the form of a code defined by the sequence of nucleotide bases.
Since DNA and RNA are each constructed from four types of nucleotides, there are 64 possible triplet
sequences or codons (4 x 4 x 4 = 64), many more than the 20 needed to specify the common amino acids
present in nature. Three of the possible codons specify the termination of the polypeptide chain. They are
called "stop codons". That leaves 61 codons to specify only 20 different amino acids, so most of
the amino acids are represented by more than one codon. The genetic code is therefore said to be
degenerate.
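The codon arithmetic above can be checked with a short Python sketch. It builds the canonical codon table
from a compact 64-letter string of one-letter amino acid codes ("*" marks a stop codon) and counts the
codons. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons of the canonical code,
# ordered UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in codon_table.items() if aa == "*")
sense_codons = {c: aa for c, aa in codon_table.items() if aa != "*"}
codons_per_amino_acid = Counter(sense_codons.values())

print(len(codons))                  # 64 possible triplets
print(stop_codons)                  # ['UAA', 'UAG', 'UGA']
print(len(sense_codons))            # 61 codons that specify amino acids
print(len(codons_per_amino_acid))   # 20 amino acids
print(codons_per_amino_acid["L"])   # leucine has 6 codons
print(codons_per_amino_acid["M"])   # methionine has 1 codon
```

Only methionine and tryptophan have a single codon in this table; every other amino acid has between two
and six.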
The vast majority of genes are encoded with exactly the same code, known as the canonical genetic
code, or simply the genetic code. Several variant codes exist, however, so the canonical genetic code
is not strictly universal. For example, in humans, protein synthesis in mitochondria relies
on a genetic code that differs from the canonical code.
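To make the mitochondrial example concrete, the Python sketch below derives the vertebrate mitochondrial
code from the canonical table by overriding four codons, then lists the differences. The four
reassignments (UGA, AUA, AGA, AGG) are the well-known ones for vertebrate mitochondria; the variable
names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Canonical code, codons ordered UUU, UUC, UUA, UUG, ... GGG ("*" = stop).
STANDARD = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
            "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

standard_code = dict(zip(("".join(t) for t in product(BASES, repeat=3)), STANDARD))

# Vertebrate mitochondrial code: same table with four codons reassigned.
mito_code = dict(standard_code)
mito_code.update({
    "UGA": "W",  # tryptophan instead of stop
    "AUA": "M",  # methionine instead of isoleucine
    "AGA": "*",  # stop instead of arginine
    "AGG": "*",  # stop instead of arginine
})

differences = {
    codon: (standard_code[codon], mito_code[codon])
    for codon in standard_code
    if standard_code[codon] != mito_code[codon]
}
for codon, (canonical, mitochondrial) in sorted(differences.items()):
    print(codon, canonical, "->", mitochondrial)
```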
During protein synthesis, a ribosome moves along an mRNA molecule from the 5' end to the 3' end and
"reads" its sequence three nucleotides (one codon) at a time. Each amino acid is specified by an mRNA
codon, which pairs with a sequence of three complementary nucleotides (the anticodon) carried by a
particular transfer RNA (tRNA) molecule. The other end of the tRNA has the amino acid attached to the
3'-OH group via an ester linkage. A tRNA molecule with an attached amino acid is said to be "charged".
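Codon-anticodon pairing can be sketched in a few lines of Python. Because the two strands pair in
antiparallel orientation, the anticodon is the reverse complement of the codon when both are written
5' to 3'. The function name anticodon_for is hypothetical, and the sketch assumes strict Watson-Crick
pairing; it ignores the relaxed ("wobble") pairing that real tRNAs allow at the third codon position.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the fully complementary anticodon, written 5' to 3'."""
    # Reverse the codon, then complement each base (antiparallel pairing).
    return "".join(COMPLEMENT[base] for base in reversed(codon))

print(anticodon_for("AUG"))  # CAU, the anticodon that reads the start codon
print(anticodon_for("UGG"))  # CCA, pairs with the tryptophan codon
```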
Figure 10 Structure of the charged transfer RNA (tRNA) molecule
When a small ribosomal subunit carrying the initiator tRNA (a tRNA charged with methionine) encounters an
mRNA, it attaches and starts to scan for the start codon, AUG. When it finds AUG, the codon for
methionine, the large subunit joins the small one to form a complete ribosome and protein synthesis is
initiated. A new charged tRNA (tRNA + amino acid) then enters the ribosome at the next codon
downstream of the AUG codon. If its anticodon matches the mRNA codon, it binds and the ribosome links
the two amino acids together. A tRNA with the wrong anticodon, and therefore the wrong amino acid,
cannot pair with the mRNA and is rejected. The ribosome then moves one triplet forward, a new charged
tRNA enters the ribosome, and the procedure is repeated. When the ribosome reaches one of the three
stop codons (UAG, UAA or UGA), no tRNA corresponds to that sequence. Instead, termination proteins
(release factors) bind to the ribosome and stimulate the release of the polypeptide chain (the protein),
and the ribosome dissociates from the mRNA.
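The steps above (scan for AUG, read one codon at a time, stop at a stop codon) can be sketched in Python
as follows. This is a minimal illustration using the canonical code, not a model of the ribosome: the
function name translate and the example mRNA are hypothetical, and the sketch assumes that translation
starts at the first AUG in the sequence.

```python
from itertools import product

BASES = "UCAG"
# Canonical code, codons ordered UUU, UUC, UUA, UUG, ... GGG ("*" = stop).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO_ACIDS))

def translate(mrna):
    """Translate an mRNA (5' to 3') into one-letter amino acid codes."""
    start = mrna.find("AUG")          # scan for the start codon
    if start == -1:
        return ""                     # no start codon, nothing is made
    peptide = []
    # Move one triplet at a time from the start codon onwards.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":         # stop codon: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical mRNA: 5' UTR, then AUG GCU UGG AAA UAA, then 3' UTR
print(translate("GGACUAUGGCUUGGAAAUAACCG"))  # MAWK
```

The output MAWK stands for methionine, alanine, tryptophan and lysine; the nucleotides before the AUG and
after the UAA are not translated.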
Figure 11 Process of mRNA translation
(Adapted from http://www.duke.edu/web/MAT/jennifer_sohn/unit/translation_oh.htm)