A short introduction to biology

Life
• Two categories:
– Prokaryotes (e.g. bacteria)
• Unicellular
• No nucleus
– Eukaryotes (e.g. fungi, plants, animals)
• Unicellular or multicellular
• Has nucleus
Prokaryote vs Eukaryote
• Eukaryotes have many membrane-bound
compartments inside the cell
– Different biological processes occur at different
cellular locations
Organism, Organ, Cell
Chemical contents of cell
• Water
• Macromolecules (polymers) - “strings” made by linking
monomers from a specified set (alphabet)
–Protein
–DNA
–RNA
–…
• Small molecules
–Sugar
–Ions (Na+, K+, Ca2+, Cl-, …)
–Hormone
–…
DNA
• DNA: forms the genetic material of all
living organisms
– Can be replicated and passed to descendants
– Contains information to produce proteins
• To computer scientists, DNA is a string
made from alphabet {A, C, G, T}
– e.g. ACAGAACGTAGTGCCGTGAGCG
• Each letter is a nucleotide
• Length varies from hundreds to billions
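The "DNA is a string" view can be tried out directly. The following is a minimal Python sketch, not part of the original slides; the helper names `is_dna` and `base_counts` are made up for illustration. It checks that a string uses only the alphabet {A, C, G, T} and counts each nucleotide in the example sequence above.

```python
from collections import Counter

# The DNA alphabet as given on the slide.
DNA_ALPHABET = set("ACGT")


def is_dna(seq: str) -> bool:
    """Return True if seq is non-empty and uses only A, C, G, T."""
    return bool(seq) and set(seq) <= DNA_ALPHABET


def base_counts(seq: str) -> dict:
    """Count how often each of the four nucleotides occurs in seq."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}


example = "ACAGAACGTAGTGCCGTGAGCG"  # example sequence from the slide
print(is_dna(example), len(example), base_counts(example))
```

A string containing U, such as an RNA sequence, fails the check because U is not in the DNA alphabet.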
RNA
• Historically thought to be information
carrier only
– DNA => RNA => Protein
– New roles have been found for them
• To computer scientists, RNA is a string
made from alphabet {A, C, G, U}
– e.g. ACAGAACGUAGUGCCGUGAGCG
• Each letter is a nucleotide
• Length varies from tens to thousands
Protein
• Protein: the actual “worker” for almost all processes in
the cell
– Enzymes: speed up reactions
– Signaling: information transduction
– Structural support
– Production of other macromolecules
– Transport
• To computer scientists, protein is a string made from 20
kinds of characters
– E.g. MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGP
• Each letter is called an amino acid
• Length varies from tens to thousands
DNA/RNA zoom-in
• Commonly referred to as nucleic acid
• DNA: Deoxyribonucleic acid
• RNA: Ribonucleic acid
• Found mainly in the nucleus of a cell (hence
“nucleic”)
• Contain phosphoric acid as a component (hence
“acid”)
• They are made up of a string of nucleotides
Nucleotides
• A nucleotide has 3 components
– Sugar ring (ribose in RNA, deoxyribose in
DNA)
– Phosphoric acid
– Nitrogen base
• Adenine (A)
• Guanine (G)
• Cytosine (C)
• Thymine (T) in DNA and Uracil (U) in RNA
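DNA
[Figure: a DNA strand drawn from the free phosphate at its 5’ (5 prime) end to its 3’ (3 prime) end. Each nucleotide has a sugar ring with carbons numbered 1 to 5, a base on carbon 1, and a phosphate on carbon 5.]
• Written as 5’-AGCGACTG-3’, or simply AGCGACTG
• Often recorded from 5’ to 3’, which is the
direction of many biological processes,
e.g. DNA replication, transcription, etc.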
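RNA
[Figure: an RNA strand drawn from the free phosphate at its 5’ (5 prime) end to its 3’ (3 prime) end. Each nucleotide has a sugar ring with carbons numbered 1 to 5, a base on carbon 1, and a phosphate on carbon 5.]
• Written as 5’-AGUGACUG-3’, or simply AGUGACUG
• Often recorded from 5’ to 3’, which is the
direction of many biological processes,
e.g. translation.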
DNA double helix
• DNA usually exists in pairs (two strands)
• Base-pair: A=T, G=C
– G-C pair is stronger than A-T pair
• Forward (+) strand: 5’-AGCGACTG-3’
• Backward (-) strand: 3’-TCGCTGAC-5’
• One strand is said to be reverse-complementary to the other
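Base pairing can be sketched as a small string operation. The Python below is an illustrative sketch, not from the slides; the function names are assumptions. `complement` gives the paired strand written 3’ to 5’ (aligned under the forward strand), `reverse_complement` gives the same strand rewritten 5’ to 3’, and `gc_fraction` is a simple measure of how many of the stronger G-C pairs a duplex contains.

```python
# Pairing rules from the slide: A=T, G=C.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Paired strand, written 3'->5' so it lines up under seq."""
    return seq.translate(PAIRING)


def reverse_complement(seq: str) -> str:
    """Paired strand, rewritten in the usual 5'->3' direction."""
    return complement(seq)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of positions that are G or C (the stronger pairs)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


forward = "AGCGACTG"                # 5'-AGCGACTG-3' from the slide
print(complement(forward))          # the backward strand read 3'->5'
print(reverse_complement(forward))  # the backward strand read 5'->3'
print(gc_fraction(forward))
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check on the pairing table.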
RNA
• RNAs are normally single-stranded
• Form complex structure by self-base-pairing
• A=U, C=G
• Can also form RNA-DNA and
RNA-RNA double strands.
– A=T/U, C=G
Protein zoom-in
• Protein is the actual “worker” for almost all processes in
the cell
• A string built from 20 kinds of chars
– E.g. MGDVEKGKKIFIMKCSQCHTVEKGGKH
• Each letter is called an amino acid
Generic chemical form of amino acid
     R   (side chain)
     |
H2N--C--COOH
     |
     H
• H2N: amino group
• COOH: carboxyl group
Units of Protein: Amino acid
• 20 amino acids, only differ at side chains
– Each can be expressed by three letters
– Or a single letter: A-Y, except B, J, O, U, X, Z
– Alanine = Ala = A
– Histidine = His = H
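The single-letter rule above is easy to check mechanically. This is a small Python sketch under that rule, not part of the original slides; the names `AMINO_ACIDS`, `is_protein`, and `THREE_LETTER` are made up for illustration, and the three-letter table holds only the two examples given on the slide.

```python
import string

# Letters the slide says are not used as amino acid codes.
EXCLUDED = set("BJOUXZ")

# Remaining capital letters: the 20 single-letter amino acid codes.
AMINO_ACIDS = [c for c in string.ascii_uppercase if c not in EXCLUDED]

# Only the two three-letter examples from the slide; not a full table.
THREE_LETTER = {"A": "Ala", "H": "His"}


def is_protein(seq: str) -> bool:
    """Return True if seq is non-empty and uses only the 20 codes."""
    return bool(seq) and set(seq) <= set(AMINO_ACIDS)


print(len(AMINO_ACIDS), "".join(AMINO_ACIDS))
print(is_protein("MGDVEKGKKIFIMKCSQCHTVEKGGKH"))
```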
Amino acids => peptide
     R                  R
     |                  |
H2N--C--COOH   +   H2N--C--COOH
     |                  |
     H                  H

=>

     R          R
     |          |
H2N--C--CO--NH--C--COOH
     |          |
     H          H
• Peptide bond: the CO--NH link between the two amino acids
Protein
[Figure: a chain of amino acids with side chains R, running from the H2N end (N-terminal) to the COOH end (C-terminal)]
• Has orientations
• Usually recorded from N-terminal to C-terminal
• Peptide vs protein: basically the same thing
• Conventions
– Peptide is shorter (< 50aa), while protein is longer
– Peptide refers to the sequence, while protein has 2D/3D structure
Genome and chromosome
• Genome: the complete DNA sequences in
the cell of an organism
– May contain one (in most prokaryotes) or
more (in eukaryotes) chromosomes
• Chromosome: a single large DNA
molecule in the cell
– May be circular or linear
– Contains genes as well as “junk DNA”
– Highly packed!
Formation of chromosome
50,000 times shorter than extended DNA
The total length of DNA present in one adult human is the
equivalent of nearly 70 round trips from the earth to the sun
Gene
• Gene: unit of heredity in living organisms
– A segment of DNA with information to make a
protein or a functional RNA
Some statistics
Organism          Chromosomes   Bases         Genes
Human             46            3 billion     20k-25k
Dog               78            2.4 billion   ~20k
Corn              20            2.5 billion   50-60k
Yeast             16            20 million    ~7k
E. coli           1             4 million     ~4k
Marbled lungfish  ?             130 billion   ?
Human genome
• 46 chromosomes: 22 pairs of autosomes + 2 sex chromosomes (X, Y)
– In each pair: 1 from mother, 1 from father
• Female: X + X
• Male: X + Y
Central dogma of molecular biology
DNA Replication
• The process of copying a double-stranded
DNA molecule
– Semi-conservative
5’-ACATGATAA-3’
3’-TGTACTATT-5’

=>

5’-ACATGATAA-3’     5’-ACATGATAA-3’
3’-TGTACTATT-5’     3’-TGTACTATT-5’
[Figure: a nucleotide triphosphate (dNTP), drawn with its three phosphates p-p-p]
• Mutation: changes in DNA base-pairs
• Proofreading and error-correcting mechanisms
exist to ensure extremely high fidelity
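Semi-conservative copying can be illustrated with strings. The Python below is a simplified sketch, not from the slides, and it ignores all enzymatic detail; `replicate` and `complement` are hypothetical helper names. Each daughter duplex keeps one parental strand and pairs it with a newly built complementary strand.

```python
# Pairing rules: A=T, G=C.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base partner of strand, aligned position by position."""
    return strand.translate(PAIRING)


def replicate(top: str, bottom: str):
    """Copy a duplex semi-conservatively.

    top is written 5'->3' and bottom 3'->5', aligned as on the slide.
    Returns two (top, bottom) duplexes; each holds one old strand.
    """
    if complement(top) != bottom:
        raise ValueError("strands are not complementary")
    daughter1 = (top, complement(top))        # old top, new bottom
    daughter2 = (complement(bottom), bottom)  # new top, old bottom
    return daughter1, daughter2


d1, d2 = replicate("ACATGATAA", "TGTACTATT")
print(d1)
print(d2)
```

With error-free copying both daughters are identical to the parent; a mutation would show up as a position where a daughter differs from it.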
Transcription
• The process by which a DNA sequence is
copied to produce a complementary RNA
– Called messenger RNA (mRNA) if the RNA
carries instructions on how to make a protein
– Called non-coding RNA if the RNA does not
carry instructions on how to make a protein
– Only consider mRNA for now
• Similar to replication, but
– Only one strand is copied
Transcription
• DNA-RNA pair: A=U, C=G, T=A, G=C
Coding strand:   5’-ACGTAGACGTATAGAGCCTAG-3’  (where genetic information is stored)
Template strand: 3’-TGCATCTGCATATCTCGGATC-5’  (for making mRNA)
mRNA:            5’-ACGUAGACGUAUAGAGCCUAG-3’
• Coding strand and mRNA have the same sequence, except
that T’s in DNA are replaced by U’s in mRNA.
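The two ways of arriving at the mRNA can be compared in a few lines of Python. This is an illustrative sketch, not from the slides; the function names are assumptions. One function pairs each template base with its RNA partner, the other takes the shortcut of replacing T with U in the coding strand, and both should give the same mRNA.

```python
# DNA-to-RNA pairing from the slide: A=U, C=G, T=A, G=C.
DNA_TO_RNA_PAIR = str.maketrans("ACGT", "UGCA")


def transcribe_template(template_3to5: str) -> str:
    """Build mRNA (5'->3') by pairing against a template written 3'->5'."""
    return template_3to5.translate(DNA_TO_RNA_PAIR)


def transcribe_coding(coding_5to3: str) -> str:
    """Shortcut: the mRNA equals the coding strand with T replaced by U."""
    return coding_5to3.replace("T", "U")


coding = "ACGTAGACGTATAGAGCCTAG"    # 5'->3', from the slide
template = "TGCATCTGCATATCTCGGATC"  # 3'->5', from the slide

print(transcribe_template(template))
print(transcribe_coding(coding))
```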
Translation
• The process of making proteins from mRNA
• A gene uniquely encodes a protein
• There are four bases in DNA (A, C, G, T), and four in
RNA (A, C, G, U), but 20 amino acids in protein
• How many nucleotides are required to encode an amino
acid in order to ensure correct translation?
– 4^1 = 4 (too few for 20 amino acids)
– 4^2 = 16 (still too few)
– 4^3 = 64 (enough)
• The actual genetic code used by the cell is a triplet.
– Each triplet is called a codon
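The counting argument for the triplet code can be written as a short loop. This Python sketch is not from the slides, and `min_codon_length` is a made-up name. It finds the smallest word length k for which an alphabet of 4 letters yields at least 20 distinct words.

```python
def min_codon_length(alphabet_size: int, n_symbols: int) -> int:
    """Smallest k with alphabet_size ** k >= n_symbols."""
    k = 1
    while alphabet_size ** k < n_symbols:
        k += 1
    return k


# Number of distinct words of length 1, 2, 3 over four nucleotides.
for k in (1, 2, 3):
    print(k, 4 ** k)

print(min_codon_length(4, 20))
```

Because 64 is larger than 20, a triplet code leaves room for several codons to map to the same amino acid and for dedicated stop codons.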
The Genetic Code
[Figure: the genetic code table, mapping each codon (first, second, and third letter) to an amino acid]
Translation
• The sequence of codons is translated to a
sequence of amino acids
• Gene: -GCT TGT TTA CGA ATT-
• mRNA: -GCU UGU UUA CGA AUU-
• Peptide: -Ala-Cys-Leu-Arg-Ile-
• Start codon: AUG
– Also codes for Met
• Stop codon: UGA, UAA, UAG
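Codon-by-codon translation can be sketched with a lookup table. The Python below is illustrative and not from the slides; it builds the standard genetic code from a compact 64-letter string (bases in the order U, C, A, G, with `*` marking stop codons) and uses single-letter amino acid codes. The names `CODON_TABLE` and `translate` are assumptions.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with bases in UCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet such as "GCU" to its amino acid letter ("A").
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str) -> str:
    """Translate mrna from its first base, stopping at a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # UAA, UAG, or UGA
            break
        peptide.append(aa)
    return "".join(peptide)


gene = "GCTTGTTTACGAATT"       # gene fragment from the slide
mrna = gene.replace("T", "U")  # GCU UGU UUA CGA AUU
print(translate(mrna))         # single-letter codes for Ala-Cys-Leu-Arg-Ile
```

This sketch reads from the first base it is given; it does not search for the start codon, so the caller has to supply a sequence that is already in frame.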
Translation
• Transfer RNA (tRNA) – a different type of RNA.
– Freely float in the cell.
– Every amino acid has its own type of tRNA that binds
to it alone.
• Anti-codon to codon binding is crucial.
[Figure: tRNA-Pro and tRNA-Leu pair their anti-codons with codons on the mRNA while the nascent peptide grows]
Transcriptional regulation
[Figure: a transcription factor and RNA polymerase bound to the promoter, upstream of the transcription starting site of a gene]
• RNA polymerase binds to a certain location on the promoter to initiate
transcription
• Transcription factors bind to specific sequences on the promoter to regulate
the transcription
– Recruit RNA polymerase: induce
– Block RNA polymerase: repress
– Multiple transcription factors may coordinate
• Will talk more in later lectures
Splicing
• Pre-mRNA needs to be “edited” to form mature mRNA
• Will talk more in later lectures.
[Figure: a gene (promoter, transcription starting site) is transcribed into pre-mRNA laid out as 5’ UTR, exon, intron, exon, intron, exon, 3’ UTR. Splicing removes the introns to give the mature mRNA, whose open reading frame (ORF) runs from the start codon to the stop codon.]
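Splicing and the open reading frame can be mimicked on a toy sequence. The Python below is a rough sketch, not from the slides: the sequences, exon positions, and the names `splice` and `find_orf` are all made up for illustration, and real splice-site recognition is far more involved. It joins the exons of a pre-mRNA, then scans the mature mRNA from the first AUG to the first in-frame stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def splice(pre_mrna: str, exon_spans) -> str:
    """Join the exon intervals (start, end) and drop everything between."""
    return "".join(pre_mrna[start:end] for start, end in exon_spans)


def find_orf(mrna: str):
    """Return the ORF from the first AUG through the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None  # no in-frame stop codon found


# Hypothetical toy transcript: two exons separated by one intron.
exon1 = "GGCACCAUGGCU"  # 5' UTR (GGCACC) + start of the ORF
intron = "GUAAGUUUUCAG"
exon2 = "UGUUAACCGUA"   # end of the ORF + 3' UTR (CCGUA)
pre_mrna = exon1 + intron + exon2

spans = [(0, len(exon1)), (len(exon1) + len(intron), len(pre_mrna))]
mature = splice(pre_mrna, spans)
print(mature)
print(find_orf(mature))
```

Everything before the start codon and after the stop codon of the mature mRNA corresponds to the 5’ UTR and 3’ UTR in the slide's figure.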
Summary
• DNA: a string made from {A, C, G, T}
– Forms the basis of genes
– Has 5’ and 3’
– Normally forms double-strand by reverse complement
• RNA: a string made from {A, C, G, U}
– mRNA: messenger RNA
– tRNA: transfer RNA
– Other types of RNA: rRNA, miRNA, etc.
– Has 5’ and 3’
– Normally single-stranded. But can form secondary structure
• Protein: made from 20 kinds of amino acids
– Actual worker in the cell
– Has N-terminal and C-terminal
– Sequence uniquely determined by its gene via the use of codons
– Sequence determines structure, structure determines function
• Central dogma: DNA transcribes to RNA, RNA translates to Protein
– Both steps are regulated