Introduction to Molecular Biology - University of Louisiana at Lafayette

Introduction to Molecular Biology
Raja Logananatharaj
Center for Advanced Computer Studies
University of Louisiana at Lafayette
Chromosome Structure
• Most plant and animal cells are diploid – they
have two sets of chromosomes
• Most plant and animal gametes are haploid –
they have one set of chromosomes
• In humans, there are 23 chromosomes per
haploid set:
22 autosomes plus 1 sex chromosome
(X or Y)
• There are about 20,000 protein-coding gene loci
on these chromosomes (older estimates ran as
high as 100,000)
Human Metaphase Chromosomes
Spectral karyotype: metaphase
A Single Chromatid
Long Arm
Centromere
Short Arm
A Pair of Homologous Chromosomes
Genes, loci, alleles and
chromosomes
• Each chromatid contains a single, very long
piece of DNA
• Each gene is a small section of this DNA
• A gene locus is the place where a gene is
located
• An allele is a particular variant of a gene;
different alleles have different DNA
sequences
Human
chromosome 6
Macromolecules
• Large molecules made up of chains of
smaller molecules
• Also called biopolymers
• Macromolecules of special interest
– Deoxyribonucleic Acid (DNA)
– Ribonucleic Acid (RNA)
– Polypeptides (including proteins)
Deoxyribonucleic Acid
DNA is made from smaller molecules:
• phosphoric acid (also called “phosphate”)
• deoxyribose (a sugar)
• nitrogenous bases, of four types:
– adenine
– thymine
– cytosine
– guanine
Building Blocks of DNA
Nitrogenous Bases
Purines
Pyrimidines
Nitrogenous Bases
• The bases are compounds known as
Purines and Pyrimidines
• Purines have two rings
• Pyrimidines have one ring
• The rings make these very flat
structures
Purines: Adenine and Guanine
The small molecules form larger units
• Nucleosides – deoxyribose plus a
nitrogenous base
– Deoxyadenosine
– Deoxycytidine
– Deoxyguanosine
– Deoxythymidine
Pyrimidines: Thymine and Cytosine
Nucleotides
• Phosphate + Sugar + Base
• Because there are 4 types of bases,
there are 4 types of nucleotides
• Nucleotides are the basic unit of DNA
and RNA structure
• Strings of nucleotides make up strands
of DNA or RNA
Nucleotide
Phosphate + Deoxyribose + one of the four bases
Bases are attached
to the side of the
sugar-phosphate backbone
For each base, a ring nitrogen is bonded
to the C-1’ carbon of the deoxyribose
Deoxythymidylic acid
Double Helix
• A DNA double helix has two strands
that run in opposite directions
(antiparallel)
• The two strands in the double helix are
held together by hydrogen bonds
between the bases.
Here 3 nucleotides are joined
to make one strand
of DNA
Nitrogenous Bases Highlighted
Base Pairing Rules
• Each of the four bases must be paired
with a specific complementary base in
the opposite strand
• A (Adenine) and T (Thymine) are
complements
• C (Cytosine) and G (Guanine) are
complements
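The pairing rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sequence contains only the unambiguous letters A, C, G and T; the names COMPLEMENT, complement and reverse_complement are illustrative and do not come from the slides.

```python
# Complementary bases, taken directly from the pairing rules:
# A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the paired bases in the same left-to-right order.

    If the input is written 5'->3', the result reads 3'->5'.
    """
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand written in the usual 5'->3' direction."""
    return complement(strand)[::-1]


top = "CCTAAAAGT"
print(complement(top))          # opposite strand, read 3'->5'
print(reverse_complement(top))  # opposite strand, read 5'->3'
```

Because the two strands are antiparallel, the reversed form is the one to use when both strands are written 5' to 3'.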
Adenine - Thymine Nucleotide Pair
Pairs of Nitrogenous Bases
Two Strands of DNA are Held Together
by Hydrogen Bonds between the bases
Ribonucleic Acid Structure (RNA)
• Same overall structure
• In RNA
– ribose instead of deoxyribose
– uracil instead of thymine
• DNA is “typically” double stranded
– (but not always!)
• RNA is “typically” single stranded
– (but not always!)
Proteins
Amino Acids, Polypeptides and Proteins
• Amino acids
– genetic code specifies 20 different amino acids
– each has an amino group, a carboxyl group, and
a “side group”
• Polypeptides
– polymers of amino acids linked by peptide bonds
between their carboxyl and amino groups
– every polypeptide has an amino terminus (end)
and a carboxyl terminus (end)
• Proteins
– assembled from one or more polypeptides
Amino Acids
Amino Acid Abbreviations
• Standard 3-letter and 1-letter
abbreviations have been devised
• Examples:
Proline
3 letter abbreviation: Pro
1 letter abbreviation: P
Arginine
3 letter abbreviation: Arg
1 letter abbreviation: R
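As a small illustration, the standard abbreviations for the 20 amino acids can be held in a lookup table. The names THREE_TO_ONE, ONE_TO_THREE and to_one_letter are hypothetical helpers for this sketch, and the input is assumed to be a run of 3-letter codes with no separators.

```python
# Standard 3-letter -> 1-letter abbreviations for the 20 amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Inverse table for converting back to 3-letter codes.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(peptide: str) -> str:
    """Convert e.g. 'ProArg' to 'PR' by reading 3 characters at a time."""
    chunks = [peptide[i:i + 3] for i in range(0, len(peptide), 3)]
    return "".join(THREE_TO_ONE[chunk] for chunk in chunks)


print(to_one_letter("ProArg"))
```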
Peptide Chains
Peptide Chains Fold into Complex
Shapes
Gene Expression
• Transcription
– DNA sequence is transcribed to RNA
sequence
– messenger RNA (mRNA) is synthesized
on a DNA template
• Translation
– mRNA sequence is translated to amino
acid sequence of protein
– each amino acid is specified by a
combination of three nucleotides (a codon)
Directionality of Transcription and Translation
• A DNA or RNA strand has a 5’ end and a 3’ end
• Polypeptides have an amino end and a carboxyl end
• For DNA replication and RNA transcription…
– nucleotides are added to the 3’ end
• For translation
– amino acids are added to the carboxyl terminus
DNA
5’-CCTAAAAGT-3’
3’-GGATTTTCA-5’
RNA
5’-CCUAAAAGU-3’
Polypeptide amino-ProLysSer-carboxyl
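The example above can be reproduced with a short Python sketch. It assumes the top DNA strand is the coding strand (so the mRNA has the same sequence with U in place of T) and that translation starts at the first nucleotide. The 64-letter string is the usual compact layout of the standard genetic code with bases ordered U, C, A, G; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, one letter per codon; '*' marks a stop codon.
# Codons are enumerated with the first base changing slowest: UUU, UUC, UUA, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """mRNA written 5'->3' from the coding strand: replace T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first nucleotide; stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("CCTAAAAGT")
print(mrna)             # CCUAAAAGU
print(translate(mrna))  # PKS, i.e. Pro-Lys-Ser from amino to carboxyl end
```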
Messenger RNA
Messenger RNA (mRNA)
– Transcribed from DNA
– Translated to amino acid sequences on ribosomes
Coding vs. non-coding regions
– Only part of the mRNA sequence is translated to
proteins (coding region)
– Biological systems must determine where the
coding region begins and ends
– Rules for finding start of coding regions are
difficult to define with precision
The Genetic Code
• Triplet codon
– each amino acid specified by 3 ribonucleotides
• Unambiguous
– each triplet sequence specifies only 1 amino acid
• Degenerate
– more than one triplet may specify the same amino acid
• Ordered
– Degenerate codons for the same amino acid are similar,
usually differing in the 3rd base
• Punctuation
– There is punctuation for the start and stop of a polypeptide
– There is no punctuation between codons (“commas”)
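The properties listed above can be checked against a table of the standard code. The sketch below builds the table from the usual 64-letter layout (bases ordered U, C, A, G, with '*' for stop) and groups codons by amino acid; CODONS_BY_AMINO_ACID is a name chosen for this illustration.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Unambiguous: a dict maps each triplet to exactly one amino acid (or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate: invert the table to see how many triplets share an amino acid.
CODONS_BY_AMINO_ACID = {}
for codon, aa in CODON_TABLE.items():
    CODONS_BY_AMINO_ACID.setdefault(aa, []).append(codon)

for aa, codons in sorted(CODONS_BY_AMINO_ACID.items()):
    print(aa, len(codons), " ".join(codons))

# Ordered: the four proline codons differ only in the 3rd base.
print(CODONS_BY_AMINO_ACID["P"])
```

In the output, Leu, Ser and Arg each have six codons, while Met and Trp have one each.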
Reading Frame
• the position of the first translated nucleotide
determines the reading frame (where every
codon starts)
• there are 3 possible reading frames for each
strand
(if translation starts on the 4th nucleotide, that’s the
same reading frame as starting on the 1st; the 5th
is the same as the 2nd, etc.)
• frameshift mutations are insertions or
deletions of nucleotides that shift the reading
frame
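A short sketch of the three reading frames of one strand. The function name codons_in_frame and the 0-based frame offset are choices made for this example; incomplete trailing triplets are dropped.

```python
def codons_in_frame(seq: str, frame: int) -> list:
    """Split seq into triplets, starting at 0-based offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


mrna = "CCUAAAAGU"
for frame in range(3):
    print(frame, codons_in_frame(mrna, frame))

# Starting on the 4th nucleotide (offset 3) gives the same frame as
# starting on the 1st, minus the first codon.
print(codons_in_frame(mrna, 3) == codons_in_frame(mrna, 0)[1:])
```

For a double-stranded DNA sequence, the same function applied to the reverse complement gives the three frames of the other strand, six in total.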
Reading Frames
Frameshift
Frameshift Mutations
• Frameshift mutations occur if the
number of bases inserted or deleted is
not a multiple of 3
• Translation past the frameshift results
in either a “junk” polypeptide, or early
termination from a stop codon
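The effect of an insertion can be seen by splitting a sequence into codons before and after the change. The mRNA below is a made-up example, and codons and insert are hypothetical helper names.

```python
def codons(seq: str) -> list:
    """Split seq into triplets from the first nucleotide, dropping leftovers."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def insert(seq: str, pos: int, bases: str) -> str:
    """Insert `bases` before 0-based position `pos`."""
    return seq[:pos] + bases + seq[pos:]


original = "AUGCCUAAAAGUUAA"               # AUG CCU AAA AGU UAA
print(codons(original))
print(codons(insert(original, 3, "G")))    # 1 base inserted: frame shifts
print(codons(insert(original, 3, "GGG")))  # 3 bases inserted: frame kept
```

In this example the single-base insertion produces the codons AUG GCC UAA ..., so a stop codon appears early. Inserting three bases only adds one codon and leaves the downstream codons unchanged.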
Start and Stop Codons
• In addition to specifying amino acids,
codons (triplets of nucleotides) mark the
starting and ending points of translation
• Start and stop codons need not lie at the ends
of an RNA sequence; which triplets act as start
and stop depends on the reading frame
Start Codon
• In bacteria:
– Start codon is AUG
– Normally codes for methionine (Met)
– Distinction between a start codon and a Met
codon depends on the neighboring sequence
– Sometimes GUG is used as start codon
• In higher organisms (plants, animals, etc):
– AUG is also the start codon
– Also used for Met
– Depends on neighboring sequence
Termination of Translation
• Three codons are “stop codons”
– UAA, UAG, UGA
• Sometimes called “nonsense” codons
because they don’t code for any amino
acids
• Nonsense mutation – changes an amino
acid codon to a stop codon
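Start and stop codons together suggest a naive way to look for candidate coding regions: find an AUG, then read triplets in that frame until a stop codon. The sketch below does only that. It treats every AUG as a possible start and ignores the neighboring sequence, so it is a simplification rather than how real start sites are identified; find_orfs is a hypothetical name.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna: str) -> list:
    """Return (start, end) pairs, 0-based, end exclusive, stop codon included."""
    orfs = []
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] != START:
            continue
        # Read in-frame triplets after the start codon until a stop codon.
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOPS:
                orfs.append((start, i + 3))
                break
    return orfs


mrna = "GGAUGCCUAAAAGUUAAGG"
for start, end in find_orfs(mrna):
    print(start, end, mrna[start:end])
```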
Split Genes
• Protein coding sequences are not
always contiguous
• In DNA sequence of a gene, coding
region may consist of multiple exons
separated by introns
• Introns are removed (splicing) from
RNA sequence before translation
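Splicing can be pictured as joining the exon segments of the transcript in order. In this sketch the exon coordinates are given as input (finding them is the hard part in practice); the sequence, the coordinates and the name splice are all made up for illustration, with the intron shown in lower case.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon segments given as (start, end) pairs, 0-based, end exclusive."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Two exons separated by one intron (lower case).
pre_mrna = "AUGCCU" + "guaagucag" + "AAAAGUUAA"
exons = [(0, 6), (15, 24)]

print(splice(pre_mrna, exons))
```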
The Almost Universal Genetic Code
• In general, all viruses, prokaryotes and eukaryotes
use the same genetic code
• Suggests a common ancestor for all forms of life
• There are a few minor variations – for example
– in mitochondria
• UGA is Trp instead of stop
• AUA is Met instead of Ile
– in some protozoa
• UAA (and sometimes UAG) is Gln instead of stop
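One way to handle such variations in a program is to keep the standard table and apply a small set of overrides per code. The sketch below covers only the differences listed above (UGA read as Trp and AUA read as Met in mitochondria; UAA and UAG read as Gln in some protozoa); real variant codes may differ in further codons. The table names and the translate helper are illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant tables: copy the standard code, then override the exceptions.
MITOCHONDRIAL = {**STANDARD, "UGA": "W", "AUA": "M"}
PROTOZOAN = {**STANDARD, "UAA": "Q", "UAG": "Q"}


def translate(mrna: str, table: dict) -> str:
    """Translate from the first nucleotide until a stop codon ('*')."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGAUAUGA", STANDARD))       # UGA acts as stop
print(translate("AUGAUAUGA", MITOCHONDRIAL))  # UGA read as Trp
```

Passing the table as an argument keeps the translation logic identical for every code.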
Introns are Removed by Splicing
Theory vs. Practice
Theory
• Simple rules
• Clean data
• Textbook examples
• Simple (elegant)
programming
approaches
• Focus on Algorithms
Practice
• Rules are broken
• Errors in data
• Real-life examples
• Simple programming
approaches may be
dead ends
• Focus on robustness
Acknowledgements
• None of the materials presented is
original
• All of them are compiled from books,
journal articles, and other PPT
materials available on the Web
Example: DNA Sequences
Theory
• Sequences are fully
characterized
ATCGGGC
• Start of translation
known
• Contiguous coding
region
• Universal genetic code
Practice
• Some nucleotides are
ambiguous
AT (C or G?) GGGC
• Start of translation
unknown
• Coding region split into
exons
• Exceptions to genetic
code
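Ambiguous nucleotides are commonly written with the IUPAC one-letter codes, where for example S stands for "C or G". The sketch below expands such a sequence into every concrete sequence it could represent. The function name expand is hypothetical, and full expansion is only practical when few positions are ambiguous, since the number of results multiplies at each one.

```python
from itertools import product

# IUPAC nucleotide codes for DNA: each letter maps to the bases it may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(seq: str) -> list:
    """List every unambiguous sequence consistent with an IUPAC-coded one."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


# "AT (C or G?) GGGC" from the slide, written with the code S.
print(expand("ATSGGGC"))
```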