Biological Sequences: DNA, RNA, Protein

Nucleotides and Nucleic Acids
• biological molecules that possess heterocyclic nitrogenous bases as principal components of their
structure
Biochemical roles of nucleotides are numerous
• nucleotides participate as essential intermediates in all aspects of cellular metabolism
• nucleic acids are linear polymers of nucleotides, i.e. polynucleotides linked by phosphodiester
bridges
• nucleic acids are elements of heredity and are involved in synthesis of proteins
An orderly sequence of nucleotide residues in a nucleic acid can encode information. The convention in all
notations of nucleic acid structure is to read the polynucleotide chain from the 5’-end to the 3’-end.
There are two basic kinds of nucleic acids
• ribonucleic acid (RNA)
• deoxyribonucleic acid (DNA)
Basic characteristics of DNA and RNA
• DNA has only one biological role, but it is a central one. The information to make all the functional
macromolecules of an organism (even DNA itself) is preserved in DNA and accessed through
transcription of the information into RNA copies. There is a single chromosome in the form of a
DNA molecule in simple life forms (e.g. bacteria). Eukaryotic cells have many chromosomes. In
addition to the nucleus, mitochondria and chloroplasts have their own DNA sequences that encode
for the proteins and RNAs unique to those organelles.
• RNA occurs in multiple copies and various forms. Cells contain as much as 8 times more RNA than
DNA material. RNA molecules are categorized into several major types: messenger RNA, ribosomal
RNA, and transfer RNA. Eukaryotic cells contain an additional type, small nuclear RNA.
DNA
• a thread-like molecule
• the DNA isolated from different cells and viruses consists of two polynucleotide strands wound
together to form a long, slender, helical molecule, the DNA double helix
• each DNA strand consists of four types of nucleotides: adenine (A), cytosine (C), guanine (G), and
thymine (T)
• the strands run in opposite directions, that is, they are antiparallel
• the strands are held together in the double helical structure through interchain hydrogen bonds
• the H-bonds pair the bases of nucleotides in one chain to complementary bases in the other (so-called
base pairing)
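The base pairing and antiparallel orientation described above can be illustrated with a minimal Python sketch. The function name and the example sequence are made up for illustration; the only biology assumed is the standard pairing of A with T and of C with G.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand of `strand`, written 5'->3'.

    The two strands are antiparallel, so the partner is read in the
    opposite direction; that is why the input is reversed before each
    base is replaced by its complement.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


# Hypothetical example sequence, written 5'->3'.
print(reverse_complement("ATGCGT"))  # ACGCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.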
The first clue for the base pairing came from Erwin Chargaff in the late 1940s, whose data showed that the four
bases commonly found in DNA do not occur in equimolar amounts and that the relative amounts of each vary
from species to species. Chargaff noted that adenine and thymine, and guanine and cytosine, are always
found in a 1:1 ratio. That is, [A] = [T] and [C] = [G].
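Chargaff's observation follows directly from base pairing, and a short Python sketch can make that concrete. It builds the second strand from a made-up sequence and counts bases over both strands; the function name and the sequence are assumptions for illustration only.

```python
from collections import Counter

# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def duplex_base_counts(strand: str) -> Counter:
    """Count the bases over both strands of the double helix built from `strand`."""
    partner = "".join(COMPLEMENT[base] for base in reversed(strand))
    return Counter(strand + partner)


# Hypothetical single strand with unequal base composition.
counts = duplex_base_counts("ATGCGTTTAC")
for base in "ATCG":
    print(base, counts[base])
```

In the single strand the four bases occur in unequal amounts, but over the whole duplex the count of A equals that of T and the count of C equals that of G, while A and C still differ.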
Watson and Crick’s Double Helix
• proposed by James Watson and Francis Crick in 1953
• the DNA double helix can adopt three main conformations: A-DNA, B-DNA, and Z-DNA
• B-DNA (a form with the usual major and minor groove) is preferred in vivo; A-DNA is a
conformation that exists in vitro, while in Z-DNA (zig-zag) the helix is left-handed
The size of DNA molecules is usually expressed in nucleotide base pairs (e.g. the E. coli chromosome
consists of ~4.6 million base pairs)
DNA in the Form of Chromosomes
• the single chromosome in prokaryotic cells is typically a circular DNA molecule, and it is associated
with very little protein
• DNA molecules of eukaryotic cells are linear and are divided among many chromosomes; each
DNA molecule is associated with proteins
• histones, a class of arginine- and lysine-rich basic proteins, interact ionically with the anionic phosphate
groups in the DNA backbone to form nucleosomes, structures in which the DNA double helix is wound
around a protein core of eight histone molecules (two copies each of four histone types)
• chromosomes also contain a varying mixture of other proteins, the so-called non-histone chromosomal
proteins, many of which are involved in regulating which genes in DNA are transcribed at any given
moment
RNA
• consists of four types of nucleotides: adenine (A), uracil (U), cytosine (C) and guanine (G)
• in contrast to DNA, the backbone contains ribose sugars (ribose has an extra OH group on the sugar ring)
Messenger RNA (mRNA)
• single-stranded macromolecule
• synthesized during transcription
• serves to carry the information (or message) that is encoded in the genes to the sites of protein
synthesis in the cell
• because it is directly “transcribed” from the DNA, it is said that this is a DNA-like RNA; however,
only genetic units of DNA are transcribed into mRNA
• in prokaryotes, a single mRNA encodes one or more polypeptide chains
• in eukaryotes, a single mRNA encodes only one polypeptide chain; eukaryotic mRNA is much
more complex, and it is synthesized in the nucleus in the form of a much larger precursor called
heterogeneous nuclear RNA (hnRNA)
• hnRNA contains stretches of nucleotide sequence that have no protein-coding capacity (intervening
sequences)
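The sense in which mRNA is “DNA-like” can be shown with a minimal Python sketch. It assumes the input is the coding (non-template) strand of a gene, written 5'→3'; under that assumption the mRNA has the same sequence with uracil in place of thymine. The function name and the example sequence are hypothetical.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence for a DNA coding strand (both written 5'->3').

    Assumption: `coding_strand` is the non-template strand, so the mRNA
    reads the same as the input with T replaced by U.
    """
    return coding_strand.upper().replace("T", "U")


# Hypothetical coding-strand fragment.
print(transcribe("ATGGCTTAA"))  # AUGGCUUAA
```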
Ribosomal RNA (rRNA)
• folds into characteristic secondary structures as a consequence of intramolecular H-bond interactions
• each subunit of a ribosome contains one or more rRNA molecules
• contains chemically modified nucleotides
Transfer RNA (tRNA)
• serves as a carrier of amino acid residues for protein synthesis
• folds into a characteristic secondary structure
tRNA structure →
Small Nuclear RNA (snRNA)
• similar to both tRNA and rRNA, but identical to neither
• located in the nucleus
• their biological purpose is to help process hnRNA into mature mRNA (before it moves from the
nucleus to the cytoplasm)
Splicing
• a modification of the originally transcribed RNA molecule
• splicing is the removal of certain pieces of the RNA molecule called “intervening sequences”
or introns
• what remains are the so-called “expressed sequences” or exons
• a large majority of eukaryotic introns start with “GT” and end with “AG” (written in the DNA alphabet)
So, only exons can encode protein sequence.
When the exons of one gene can be joined in different combinations to produce more than one protein, we
call this phenomenon alternative splicing. This means that one gene can encode many proteins.
This form of encoding is very efficient and new proteins can evolve much faster than in prokaryotes.
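Splicing and alternative splicing can be sketched in a few lines of Python. This is an illustrative toy, not a splice-site predictor: the intron coordinates are given by hand as 0-based, half-open intervals, the sequences are made up, and the DNA alphabet is used so that the “GT ... AG” rule can be checked literally. All names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive


def splice(pre_mrna: str, introns: List[Interval]) -> str:
    """Remove the given introns and join the remaining exons in order."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # last exon
    return "".join(exons)


def follows_gt_ag_rule(pre_mrna: str, intron: Interval) -> bool:
    """Check whether an intron starts with GT and ends with AG."""
    start, end = intron
    segment = pre_mrna[start:end]
    return segment.startswith("GT") and segment.endswith("AG")


# Made-up transcript: exon1 + intron1 + exon2 + intron2 + exon3.
pre_mrna = "ATGGCC" + "GTAAGTCCAG" + "TTCGGA" + "GTCCCTTTAG" + "CATTAA"

# Removing both introns keeps all three exons.
print(splice(pre_mrna, [(6, 16), (22, 32)]))  # ATGGCCTTCGGACATTAA

# Alternative splicing (exon skipping): exon2 is removed with the introns.
print(splice(pre_mrna, [(6, 32)]))            # ATGGCCCATTAA
```

The two calls to `splice` produce two different mature sequences from the same transcript, which is the sense in which one gene can encode more than one protein.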
Proteins
• diverse and abundant class of macromolecules
• constitute more than 50% of the dry weight of cells
• play the central role in virtually all aspects of cell structure and function
• called “machinery of life”
Proteins are linear polymers of amino acids.
Peptide classification
• two amino acid residues – dipeptide
• three amino acid residues – tripeptide
• four amino acid residues – tetrapeptide
• about 10 or more – oligopeptides
• about 20 or more – polypeptides
but the naming conventions are not precise.
What is a protein?
• proteins are composed of one or more polypeptide chains
• proteins composed of only one chain are called monomeric proteins (monomers)
• proteins composed of more than one chain are called multimeric proteins
• multimeric proteins may contain only one kind of polypeptide chain, in which case they get the prefix
“homo”, or they may consist of different polypeptide chains, in which case they get the prefix “hetero”
• so, a protein consisting of two identical polypeptide chains is called a homodimer, and a protein
consisting of 4 different chains is called a heterotetramer (or heteromultimer)
Proteins are built from 20 standard amino acids (you should know them all!)
Architecture of Protein Molecules
Some of the forms are
• fibrous proteins – have a relatively simple regular structure; have structural roles in cells
• globular proteins – roughly spherical in shape; compact
• membrane proteins – found in association with the various membrane systems of cells
Levels of Protein Structure
• primary structure – amino acid sequence
• secondary structure – the local arrangement of amino acid residues that is a consequence of
interactions between adjacent residues (H-bonds); there are three broad categories here: helical
structure, sheet structure, and other types of structure (called “other” or “coil”)
• tertiary structure – the spatial arrangement of secondary structure elements; the tertiary structure is
often referred to as protein 3-D conformation or shape
• quaternary structure – complexes of polypeptide chains (called subunits) or
Notice that primary structure is determined by covalent bonds while higher order structures are
predominantly determined by weak interactions (of course, not always)
Biological Function of Proteins
Enzymes
• the largest class of proteins
• their main function is to accelerate the rates of biological reactions (as much as 10^16 times)
• virtually every step in metabolism is catalyzed by an enzyme
Regulatory proteins
• regulate the ability of other proteins to perform their biological functions (e.g. insulin)
• regulate gene expression (usually bind to DNA and either activate or inhibit transcription, e.g.
repressors)
Transport proteins
• their function is to transport specific substances from one place to another (e.g. hemoglobin or serum
albumin)
• a different type of transport is performed by membrane proteins: these proteins take up metabolite
molecules on one side of the membrane, transport them across the membrane, and release them on
the other side (they form channels in the membrane through which the transported substances pass)
Storage proteins
• their biological function is to provide a reservoir of an essential nutrient (casein is the major nitrogen
source for mammalian infants)
Contractile and motile proteins
• provide a cell with unique capabilities for movement
• cell division, muscle contraction, cell motility represent some of the ways in which cells execute
motion
• these proteins are filamentous or polymerize to form filaments (e.g. myosin)
• another class of proteins involved in motion is the so-called motor proteins, which drive the movement
of vesicles, granules, and organelles
Structural proteins
• apparently passive, but very important role of proteins
• provide strength and protection to cells and tissues
• monomeric units of structural proteins typically polymerize to generate long fibers (as in hair) or
protective sheets of fibrous arrays
• collagen is an important fibrous protein found in bones, connective tissue, tendons, cartilage, where it
forms inelastic fibers of great strength
• elastin is an important component of ligaments and it has elastic properties
• fibroin is the major constituent of spider webs
Scaffold (adapter) proteins
• have a recently discovered role in the complex cell response to hormones and growth factors
• possess modular organization in which specific parts of the protein’s structure recognize and bind
certain structural elements in other proteins through protein-protein interactions
• scaffold proteins act as a scaffold onto which a set of different proteins is assembled into a
multiprotein complex
• anchoring proteins bind other proteins, causing them to associate with other structures in the cell
Protective and exploitive proteins
• have a biologically active role in cell defense or protection
• prominent members are the immunoglobulins (or antibodies) produced by the lymphocytes of
vertebrates: they recognize and neutralize “foreign” molecules resulting from the invasion of the
organism by bacteria, viruses, or other infectious agents
• blood-clotting proteins (e.g. thrombin) prevent the loss of blood when the circulatory system is damaged
• antifreeze proteins protect blood of arctic/antarctic fish against freezing
• various toxins and defensive proteins (e.g. ricin deters predation by herbivores)
Exotic proteins
• do not quite fit the previous classification
• monellin – found in an African plant, has a very sweet taste and may be used as an artificial sweetener
• glue proteins – secreted by some marine organisms, enabling them to stick to hard surfaces
Many proteins have chemical groups other than amino acids. These proteins are termed conjugated proteins.
If the non-protein part is essential to protein function, it is referred to as a prosthetic group.
Some of the conjugated proteins are
• glycoproteins – contain carbohydrates
• lipoproteins – conjugated with lipids
• nucleoproteins
• phosphoproteins – have phosphate groups attached
• metalloproteins
Sequence Homology
• in biology, two or more structures are said to be homologous if they are alike because of shared
ancestry
• at the DNA or protein sequence level, homology is usually inferred when two sequences are sufficiently similar
Homologous sequences are formed by gene duplication (paralogs) and/or speciation (orthologs) during
evolution.
Important: distinguish similar sequences from homologous sequences. Two sequences are either homologous
or they are not homologous (binary decision). Similarity, in contrast, is a matter of degree: sequence
identity is used to measure how similar two sequences are.
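Sequence identity can be computed with a short Python sketch. It assumes the two sequences have already been aligned to the same length, with “-” marking gaps, and it divides the number of matching positions by the number of alignment columns. Other denominators are also in use, so this choice, the function name, and the example sequences are assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumptions: the inputs are already aligned (equal length, '-' for a gap);
    columns that are gaps in both sequences are ignored; a gap never counts
    as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


# Hypothetical aligned DNA fragments: 6 of 8 positions agree.
print(percent_identity("ACGTACGT", "ACGTTCGA"))  # 75.0
```

The result is a number between 0 and 100, whereas homology itself stays a yes-or-no statement about ancestry; a high identity is only evidence for it.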
Sources:
Biochemistry by Reginald H. Garrett and Charles M. Grisham
Fundamental concepts of Bioinformatics by Dan E. Krane and Michael L. Raymer
Internet
http://ntri.tamuk.edu/cell/nucleic.html (good for nucleotide information, has videos)