00_BioBackground

Cells, DNA, RNA and Proteins
Simplified!
Cells
• The fundamental unit of life is the cell
• A cell consists of a protective membrane surrounding a collection of
organelles (subcellular structures) and large and complex molecules that
provide cellular structure, energy, and the means for the cell to reproduce
• In plants and animals, individual cells cooperate to form multicellular
tissues and organ systems that meet the biological needs of the organism
• We are interested in biological sequences that regulate all biological
processes in cells and organisms
• Our primary concern is the set of instructions for the organization of cells
during the development of an organism
DNA
• The instruction sequences are stored in very long chemical strings called
DNA
• DNA is the main information carrier molecule in a cell
• DNA may be single or double stranded.
• A single stranded DNA molecule, also called a polynucleotide, is a chain of
small molecules, called nucleotides.
• There are four different nucleotides grouped into two types,
– purines: adenine and guanine and
– pyrimidines: cytosine and thymine.
• They are usually referred to as bases and denoted by their initial letters,
A, C, G and T
DNA
• Different nucleotides can be linked together in any order to form a
polynucleotide, for instance, like this
A-G-T-C-C-A-A-G-C-T-T
• Polynucleotides can be of any length and can have any sequence
• The two ends of this molecule are chemically different, i.e., the sequence
has a directionality, like this
A->G->T->C->C->A->A->G->C->T->T->
• The two ends of a polynucleotide are labeled 5' and 3'.
• By convention, DNA is usually written with the 5' end on the left and the
3' end on the right, with the coding strand on top.
DNA
• Two strands are said to be complementary if one can be obtained from the
other by
– mutually exchanging A with T and C with G, and
– changing the direction of the molecule to the opposite.
A->G->T->C->C->A->A->G->C->T->T->
<-T<-C<-A<-G<-G<-T<-T<-C<-G<-A<-A
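The two-step rule above (exchange bases, then reverse direction) can be sketched in a few lines of Python. This is only an illustration on plain strings; the names COMPLEMENT, complement and reverse_complement are chosen here and do not come from the slides.

```python
# Pairing rule from the slide: A <-> T and C <-> G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Exchange each base for its partner, keeping the written order."""
    return "".join(COMPLEMENT[base] for base in seq)

def reverse_complement(seq):
    """Complementary strand, written in its own 5'->3' direction."""
    return complement(seq)[::-1]

strand = "AGTCCAAGCTT"
print(complement(strand))          # TCAGGTTCGAA (as drawn under the strand)
print(reverse_complement(strand))  # AAGCTTGGACT (same strand, read 5'->3')
```

Applying reverse_complement twice returns the original strand, which is a quick sanity check on the pairing table.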
DNA
• Specific pairs of nucleotides can form weak bonds between them
• A binds to T, C binds to G.
• Although such interactions are individually weak, when two longer
complementary polynucleotide chains meet, they tend to stick together
5' C-G-A-T-T-G-C-A-A-C-G-A-T-G-C 3'
| | | | | | | | | | | | | | |
3' G-C-T-A-A-C-G-T-T-G-C-T-A-C-G 5'
• The vertical lines between the two strands shown above represent the weak
bonds (hydrogen bonds) between paired bases.
• The A-T and G-C pairs are called base-pairs (bp).
• The length of a DNA molecule is usually measured in base-pairs or
nucleotides (nt), which in this context is the same thing.
DNA Double Helix
Two complementary polynucleotide chains
form a stable helical structure
known as the DNA double helix.
About 10 bp make up one full turn of the helix,
which is about 3.4 nm long.
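As a rough worked example, the two figures above give the rise per base pair and let us estimate the length of any stretch of double helix. The sketch below assumes the approximate slide values (10 bp and 3.4 nm per turn); the function names are illustrative only.

```python
BP_PER_TURN = 10    # approximate value from the slide
NM_PER_TURN = 3.4   # approximate value from the slide
NM_PER_BP = NM_PER_TURN / BP_PER_TURN  # about 0.34 nm rise per base pair

def helix_length_nm(n_bp):
    """Approximate length in nm of a double helix with n_bp base pairs."""
    return n_bp * NM_PER_BP

def helix_turns(n_bp):
    """Approximate number of full helical turns in n_bp base pairs."""
    return n_bp / BP_PER_TURN

print(helix_length_nm(1000))  # roughly 340 nm
print(helix_turns(1000))      # 100.0
```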
DNA
• It is remarkable that two complementary DNA polynucleotides form a stable
double helix almost regardless of the sequence of the nucleotides
• This makes the DNA molecule a perfect medium for information storage
• Note that because the strands are complementary, either one of the strands
of the DNA molecule contains all the information
• Thus, for many information-related purposes, the molecule used in the
example above can be represented as CGATTGCAACGATGC
• The maximal amount of information that can be encoded in such a
molecule is therefore 2 bits times the length of the sequence
• Noting that the distance between nucleotide pairs in DNA is about 0.34
nm, there are about 3x10^7 base pairs per cm, so the linear information
storage density in DNA is about 6x10^7 bits/cm
• This is approximately 7 MB per cm
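The density estimate can be checked directly from the two inputs given in the bullets: 2 bits per base pair and 0.34 nm between base pairs. The following is a minimal sketch of that arithmetic, assuming 1 MB = 10^6 bytes. With these inputs the result comes out near 6x10^7 bits per cm, a few megabytes per cm.

```python
import math

BITS_PER_BASE = math.log2(4)  # four possible bases -> 2 bits per position
NM_PER_BP = 0.34              # spacing between base pairs, from the slide
NM_PER_CM = 1e7               # 1 cm = 10^7 nm

bp_per_cm = NM_PER_CM / NM_PER_BP
bits_per_cm = BITS_PER_BASE * bp_per_cm
megabytes_per_cm = bits_per_cm / 8 / 1e6  # 8 bits per byte, 10^6 bytes per MB

print(f"{bp_per_cm:.2e} bp/cm")          # 2.94e+07 bp/cm
print(f"{bits_per_cm:.2e} bits/cm")      # 5.88e+07 bits/cm
print(f"{megabytes_per_cm:.1f} MB/cm")   # 7.4 MB/cm

# The 15-base example sequence carries at most 2 * 15 = 30 bits.
print(BITS_PER_BASE * len("CGATTGCAACGATGC"))  # 30.0
```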
DNA
• Regions in the DNA sequence encode instructions for the manufacture of
proteins in the cell
• Proteins are linear chains whose elements come from a set of 20
chemically active building blocks known as amino acids.
• Each protein has a unique sequence of amino acids that is determined by a
DNA sequence on the chromosomes.
• The proteins enable an organism to build needed structures and to carry
out its biological functions.
• Using a specific biological mechanism – transcription – the DNA is “read”
and searched for specific patterns that mark the beginning and end of
hereditary information
• That information is the gene
RNA
• Transcription produces another long string called messenger RNA (mRNA)
• The mRNA is what actually specifies the amino acid sequence.
• mRNA molecules are very similar structurally and chemically to DNA
• Exceptions: they are single-stranded and have a different base, uracil (U),
instead of thymine (T). They also have a different backbone sugar (ribose
instead of deoxyribose).
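At the level of sequence strings, transcription can be pictured as copying the coding strand and writing U wherever the DNA has T. The sketch below is a deliberate simplification: it assumes the input is the coding strand written 5' to 3', and it ignores where transcription starts and stops and all RNA processing. The function name transcribe is illustrative.

```python
def transcribe(coding_strand):
    """Return the mRNA string matching a DNA coding strand (5'->3').

    The mRNA has the same base order as the coding strand,
    with uracil (U) in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")

print(transcribe("CGATTGCAACGATGC"))  # CGAUUGCAACGAUGC
```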
Translation
• mRNA also has specific regions indicating the start of the code for a protein
• Large organelles in the cytoplasm (ribosomes) bind to the start sites
• The ribosomes then move along the mRNA in a defined chemical direction
(5' to 3'), reading three-base sequences (codons) one at a time
• Each codon specifies an amino acid
• The corresponding amino acid is then added to a growing chain that forms
the protein
• This continues until one of several stop codons is reached
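The reading process described above can be sketched as a loop over codons. The version below is a simplified illustration, not a model of real ribosome behavior: it assumes translation begins at the first AUG in the string, uses the standard genetic code with one-letter amino acid codes, and marks stop codons with "*". The names CODON_TABLE and translate are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")  # simplifying assumption: first AUG is the start
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)

# Codons read: AUG GCC AAG UUU UAA -> M A K F, then stop
print(translate("GGAUGGCCAAGUUUUAAGC"))  # MAKF
```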
Genetic Code
• Dictionaries are the natural Python representation of tabular data.
• Next time, we will illustrate this with a representation of the codon table for
protein synthesis
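As a small preview of the idea, a lookup table maps naturally onto a Python dictionary: the key is the table's row label and the value is the entry. The sketch below uses only a hand-picked excerpt of 8 of the 64 codons, with three-letter amino acid abbreviations; the variable names are illustrative.

```python
# Excerpt only: 8 of the 64 codons of the standard genetic code.
CODON_EXCERPT = {
    "AUG": "Met",   # also the usual start codon
    "UUU": "Phe",
    "UUC": "Phe",
    "GGU": "Gly",
    "GGC": "Gly",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}

# Looking up a row of the table by its key.
print(CODON_EXCERPT["AUG"])           # Met
print(CODON_EXCERPT.get("NNN", "?"))  # ? (key not in the excerpt)

# Inverting the table: which codons specify each amino acid?
codons_for = {}
for codon, amino_acid in CODON_EXCERPT.items():
    codons_for.setdefault(amino_acid, []).append(codon)

print(codons_for["Stop"])  # ['UAA', 'UAG', 'UGA']
```

The inverted dictionary shows that several codons can specify the same amino acid or stop signal.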
Transcription and Translation
• Once formed, proteins rapidly fold from a linear string into simple helical
and stranded elements
• These new components are then organized into a complex three-dimensional structure
• The resulting protein molecule may serve as a tissue building block or have
a very specific chemical activity
• The collection of proteins produced by an organism, the proteome, is
responsible for the organism’s structure and biological behavior.