Gene Expression

Gene expression
• All cells in one organism have the same
DNA. But different cells have very
different functions.
• In each cell at certain times only some
genes are expressed.
• Which genes are expressed at which
times?
Cells
• muscle
• nerve
Double-stranded DNA
DNA Structure
DNA matching
• Every A forms two weak hydrogen
bonds with T.
• Every T forms two hydrogen bonds with
A.
• Every C forms three weak hydrogen
bonds with G.
• Every G forms three hydrogen bonds
with C.
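The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the names PAIR, BONDS, complement and hydrogen_bonds are made up for this example, and the partner strand is written in the same left-to-right order as the input, as on the slides.

```python
# Watson-Crick pairing for DNA: each base and its partner.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hydrogen bonds per base pair, as listed in the bullets above.
BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def complement(strand: str) -> str:
    """Return the partner strand, base by base, in the same order."""
    return "".join(PAIR[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its partner."""
    return sum(BONDS[base] for base in strand)


print(complement("ATGGTG"))      # TACCAC
print(hydrogen_bonds("ATGGTG"))  # 15
```

G-C pairs contribute three bonds each, so a G-C rich stretch is held together by more hydrogen bonds than an A-T rich stretch of the same length.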
RNA
• RNA is also a sequence of nucleotides.
• RNA means “ribonucleic acid”.
• DNA means “deoxyribonucleic acid”.
Nucleotides
RNA
DNA Structure
DNA vs RNA
• Both are strings of nucleotides.
• DNA is usually double-stranded; RNA is
usually single-stranded.
• RNA is usually much shorter than DNA.
• RNA replaces each T by U (uracil).
• DNA contains deoxyribose while RNA
contains ribose. This makes DNA more
stable chemically than RNA.
DNA and RNA
• DNA in your cells is in the nucleus; RNA
can be anywhere in the cell.
• Proteins are made directly using RNA,
not DNA.
Central Dogma
• A protein-coding region of DNA is
copied to messenger RNA (mRNA) by
transcription.
• The mRNA leaves the nucleus and
goes to a ribosome.
• The ribosome uses the mRNA to make
a protein by translation.
Central Dogma
Translating codons
• Ala/A GCT, GCC, GCA, GCG
• Arg/R CGT, CGC, CGA, CGG, AGA, AGG
• Asn/N AAT, AAC
• Asp/D GAT, GAC
• Cys/C TGT, TGC
• Gln/Q CAA, CAG
• Glu/E GAA, GAG
• Gly/G GGT, GGC, GGA, GGG
• His/H CAT, CAC
• Ile/I ATT, ATC, ATA
• START ATG
• Leu/L TTA, TTG, CTT, CTC, CTA, CTG
• Lys/K AAA, AAG
• Met/M ATG
• Phe/F TTT, TTC
• Pro/P CCT, CCC, CCA, CCG
• Ser/S TCT, TCC, TCA, TCG, AGT, AGC
• Thr/T ACT, ACC, ACA, ACG
• Trp/W TGG
• Tyr/Y TAT, TAC
• Val/V GTT, GTC, GTA, GTG
• STOP TAG, TGA, TAA
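As a rough sketch, the codon table can be turned into a lookup and used to translate a stretch of coding DNA. The names CODONS, CODON_TO_AA and translate are invented for this example, "*" is used here as a marker for STOP, and the function assumes the sequence is already in the correct reading frame.

```python
# The codon table from this slide: amino acid letter -> DNA codons.
CODONS = {
    "A": "GCT GCC GCA GCG",
    "R": "CGT CGC CGA CGG AGA AGG",
    "N": "AAT AAC",
    "D": "GAT GAC",
    "C": "TGT TGC",
    "Q": "CAA CAG",
    "E": "GAA GAG",
    "G": "GGT GGC GGA GGG",
    "H": "CAT CAC",
    "I": "ATT ATC ATA",
    "L": "TTA TTG CTT CTC CTA CTG",
    "K": "AAA AAG",
    "M": "ATG",  # also the START codon
    "F": "TTT TTC",
    "P": "CCT CCC CCA CCG",
    "S": "TCT TCC TCA TCG AGT AGC",
    "T": "ACT ACC ACA ACG",
    "W": "TGG",
    "Y": "TAT TAC",
    "V": "GTT GTC GTA GTG",
    "*": "TAG TGA TAA",  # STOP (the "*" marker is a choice made here)
}

# Invert the table: codon -> amino acid letter (64 entries in total).
CODON_TO_AA = {
    codon: amino_acid
    for amino_acid, codons in CODONS.items()
    for codon in codons.split()
}


def translate(dna: str) -> str:
    """Translate coding DNA, three bases at a time, up to the first STOP."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TO_AA[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGTGCATCTGACTCCT"))  # MVHLTP
```

The input here is the first 18 bases of the beta hemoglobin sequence used later in the slides; the result matches the first six letters of its primary structure.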
Protein primary structure
3D views of proteins
DNA for beta hemoglobin
• ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTG
TGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGG
CAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTC
CTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAA
GGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGG
CCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAG
TGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAG
GCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGG
CAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGT
GGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA
Primary structure for beta
hemoglobin
• MVHLTPEEKSAVTALWGKVNVDEVGG
EALGRLLVVYWTQRFFESFGDLSTPD
AVMGNPKVKAHGKKVLGAFSDGLAHL
DNLKGTFATLSELHCDKLHVDPENFRL
LGNVLVCVLAHHFGKEFTPPVQAAYQ
KVVAGVANALAHKYH
Part of the two strands for
beta hemoglobin
• ATGGTGCATCTGACTCCT…
• TACCACGTAGACTGAGGA…
• The top is the sense or coding strand; the
bottom is the antisense or template
strand.
Transcription:
Make mRNA
• ATGGTGCATCTGACTCCT… sense (coding)
• TACCACGTAGACTGAGGA… antisense (template)
• AUGGUGCAUCUGACUCCU… mRNA
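The three lines above can be reproduced with a short sketch. There are two equivalent ways to get the mRNA: copy the strand that reads ATGGTG… with U in place of T, or pair each base of the strand that reads TACCAC… with its RNA partner. The function names are made up for this example, and both strands are written left to right as aligned on the slide, ignoring 5'/3' direction.

```python
# RNA partner for each DNA base on the strand being read.
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def mrna_from_matching_strand(dna: str) -> str:
    """The mRNA has the same letters as this strand, with U for T."""
    return dna.replace("T", "U")


def mrna_from_paired_strand(dna: str) -> str:
    """Build the mRNA by pairing each base of this strand with an RNA base."""
    return "".join(RNA_PAIR[base] for base in dna)


top = "ATGGTGCATCTGACTCCT"
bottom = "TACCACGTAGACTGAGGA"

print(mrna_from_matching_strand(top))   # AUGGUGCAUCUGACUCCU
print(mrna_from_paired_strand(bottom))  # AUGGUGCAUCUGACUCCU
```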
Structure of mRNA
mRNA goes to a ribosome,
outside the nucleus
• AUGGUGCAUCUGACUCCU… mRNA
Eukaryotic cell
• (1) nucleolus
• (2) nucleus
• (3) ribosomes (little
dots)
• (5) rough endoplasmic
reticulum (ER)
• (9) mitochondria
• (10) vacuole
• (11) cytoplasm
Ribosomes
• The ribosome functions as a factory to
make proteins. It uses two kinds of
input:
• (a) mRNA
• (b) tRNA
• It outputs a protein.
Ribosome translates mRNA
• Ribosome (2) straddles mRNA (1)
• It makes the protein (3).
• It starts at AUG and ends at a stop codon
(UAG, UGA, or UAA).
Ribosome large subunit
Transfer RNA (tRNA)
• Each tRNA molecule has on one side an
anticodon that binds to a specific
codon and on the other side a
site that carries the
corresponding amino acid.
tRNA
• CCA tail in orange,
Acceptor stem in
purple, D arm in red,
Anticodon arm in
blue with Anticodon
in black, T arm in
green.
tRNA carries the amino acid
matched to the codon
• UAC … M tRNA will bind with the
codon AUG in the mRNA.
• CAC … V tRNA will bind with the
codon GUG in the mRNA.
mRNA in a ribosome has the
genetic information
• AUGGUGCAUCUGACUCCU…
• UAC … M tRNA will bind with the
codon AUG.
• CAC … V tRNA will bind with the
codon GUG.
mRNA goes to a ribosome
• AUGGUGCAUCUGACUCCU… mRNA
• UAC … M tRNA
• CAC … V tRNA
• The ribosome matches UAC on tRNA
with AUG on mRNA, then uses the M
on the other end in the protein.
mRNA goes to a ribosome
• AUGGUGCAUCUGACUCCU… mRNA
• UAC … M tRNA
• CAC … V tRNA
• The ribosome matches CAC on tRNA
with GUG on mRNA, then uses the V on
the other end to extend the protein.
Ribosome
• In this manner, the ribosome continues
to make the protein until it reaches a
STOP codon.
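The whole loop can be sketched as a toy model: for each codon, find the tRNA whose anticodon pairs with it, add that tRNA's amino acid, and stop at a STOP codon. Everything named here (TRNA_POOL, ribosome) is an assumption made for illustration. The pool holds only the tRNAs needed for the first six codons, derived from the codon table, and a UAA stop codon is appended to the example mRNA purely so the loop has somewhere to end; the real mRNA continues.

```python
# Base pairing between mRNA codon and tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}
STOP_CODONS = {"UAG", "UGA", "UAA"}

# Toy tRNA pool: anticodon -> amino acid carried.
# UAC/M and CAC/V are from the slides; the rest follow from the codon table.
TRNA_POOL = {
    "UAC": "M",  # pairs with AUG
    "CAC": "V",  # pairs with GUG
    "GUA": "H",  # pairs with CAU
    "GAC": "L",  # pairs with CUG
    "UGA": "T",  # pairs with ACU
    "GGA": "P",  # pairs with CCU
}


def ribosome(mrna: str, trna_pool: dict) -> str:
    """Read codons from the first AUG until a STOP codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        anticodon = "".join(RNA_PAIR[base] for base in codon)
        protein.append(trna_pool[anticodon])
    return "".join(protein)


# First six codons of the example mRNA plus an illustrative UAA stop.
print(ribosome("AUGGUGCAUCUGACUCCUUAA", TRNA_POOL))  # MVHLTP
```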
When is a given gene being
expressed?
• A given protein is being made when its
mRNA is present in the cell.
• The DNA is always present.
When is a given gene being
expressed?
• To tell what is being expressed at a
given time in a given cell, find out which
mRNAs are present.
• For each kind of mRNA, measure the
quantity present.
A microarray
Microarrays
• A microarray consists of a pattern of
thousands of features.
• Each feature has some DNA that will probe
and possibly bind with an mRNA sample.
• Typically the feature is made to fluoresce
under the presence of binding mRNA.
• The brightness of the dot corresponds to the
quantity of mRNA of the given sort that is
present.
Two gene chips
Microarrays
• Typically the probe is attached to a solid
surface such as a glass or silicon chip.
It is then called a gene chip; Affymetrix
arrays are a well-known commercial example.
Introns
• Introns are non-coding stretches of DNA that
interrupt the sequence coding for one protein.
• The parts that code are exons.
Introns must be removed to
make the mature mRNA
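Removing introns amounts to keeping only the exon segments and joining them in order. The sketch below assumes the exon boundaries are already known and given as (start, end) positions. The pre-mRNA string and its 8-base intron are invented for illustration and are not the real beta hemoglobin intron.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join the exon segments, given as (start, end) with end exclusive."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA: exon AUGGUG, made-up intron GUAAGUAG, exon CAUCUG.
pre_mrna = "AUGGUG" + "GUAAGUAG" + "CAUCUG"
exons = [(0, 6), (14, 20)]

print(splice(pre_mrna, exons))  # AUGGUGCAUCUG
```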
cDNA
• Complementary DNA (cDNA) is DNA
synthesized from mature mRNA using
reverse transcriptase.
• AUGGUGCAUCUG mRNA
• TACCACGTAGAC cDNA
cDNA
• cDNA is more stable than RNA.
• cDNA corresponds to the gene sequence in
the genome with the introns
removed.
• cDNA does not correspond exactly to
nuclear DNA.
The mature mRNA
The probes
• Each dot can contain DNA, cDNA, or an
oligonucleotide (oligo).
• An oligonucleotide is a short fragment of
single-stranded DNA, typically 5 to 50
nucleotides long.
Gene expression profiling
• In an mRNA or gene expression profiling
experiment the expression levels of
thousands of genes are monitored
in parallel. This can be used
to distinguish
• (a) the effects of certain treatments
• (b) the effects of diseases
• (c) the effects of different stages of
development.
Gene expression profiling
• For example, microarrays can identify
genes whose expression is changed in
response to pathogens by comparing
gene expression in infected cells to that
in uninfected cells.
A microarray experiment
• Suppose there are two cell types: type 1,
healthy, and type 2, diseased. Both
have four genes A, B, C, D. We want
to compare the expression of these
genes in the two types of cell.
Procedure
• 1. Prepare the DNA chip using the chosen
probe DNAs.
• 2. From the cells, isolate the mRNA.
• 3. Use the mRNA as templates to generate
cDNA with a fluorescent tag attached.
Typically a green fluorescent tag is used for
mRNA from healthy cells, while a red tag is
used for mRNA from diseased cells.
Procedure
• 4. Prepare a hybridization solution with a
mixture of the fluorescently labeled cDNAs.
• 5. Incubate the hybridization solution with the
DNA chip.
• 6. Detect bound cDNA using laser
technology.
• 7. Analyze the data.
Appearance afterwards
Interpreting colors
• A spot with just healthy cDNA is green.
• A spot with just diseased cDNA is red.
• A spot with both is yellow.
• A spot with neither is black.
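The color rules can be written as a small decision function. This is a simplified sketch: it treats each channel as either present or absent using a single cutoff, whereas real scanners report continuous intensities. The threshold of 0.5 and the intensity values for genes A to D are invented for illustration.

```python
def spot_color(green: float, red: float, threshold: float = 0.5) -> str:
    """Classify a spot from its green (healthy) and red (diseased) signal."""
    has_green = green >= threshold
    has_red = red >= threshold
    if has_green and has_red:
        return "yellow"
    if has_green:
        return "green"
    if has_red:
        return "red"
    return "black"


# Hypothetical (green, red) intensities for the four genes in the example.
spots = {"A": (0.9, 0.1), "B": (0.1, 0.8), "C": (0.7, 0.9), "D": (0.0, 0.1)}

for gene, (green, red) in spots.items():
    print(gene, spot_color(green, red))
```

With these made-up values, gene A would be read as expressed only in healthy cells, B only in diseased cells, C in both, and D in neither.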
Comparison of cells
• Microarrays are used to compare the
genome content in different cells for the
same organism.
Single Nucleotide
Polymorphisms
• A single nucleotide polymorphism (SNP) is a
substitution of a single nucleotide at one
position in the genome.
• Example:
• AUGGUGCAUCUGACUCCU standard
• AUGGUGUAUCUGACUCCU SNP
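The substitution in this example can be located by comparing the two sequences position by position. A minimal sketch, assuming both sequences have the same length and are already aligned; the function name differences is made up here, and positions are counted from 0.

```python
def differences(standard: str, variant: str) -> list:
    """List (position, standard base, variant base) wherever the two differ."""
    if len(standard) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(standard, variant))
        if a != b
    ]


standard = "AUGGUGCAUCUGACUCCU"
variant = "AUGGUGUAUCUGACUCCU"

print(differences(standard, variant))  # [(6, 'C', 'U')]
```

The single difference falls in the third codon, which changes from CAU to UAU.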
Detecting SNPs
• Microarrays can be used to detect
SNPs between or within populations.
• This can measure predisposition to
diseases or identify appropriate drugs.
How are chips made?
• In spotted microarrays the probes may be
small fragments of DNA. A robotic arm dips an
array of fine needles into wells containing
the DNA probes.
Each needle then deposits a probe at the
desired location on the surface. The probes
are fixed to the surface. Then the chip is
ready to be washed in a solution containing
the targets.
DNA microarray being printed
by a robot
Flexibility of microarrays
• Thus scientists can produce arrays in
their own labs, customized to an
experiment.
Bioinformatics problems
• 1. How long should the probes be (i.e.,
how many nucleotides long)?
• If too short, you get false signals.
• If too long, it is expensive.
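One way to see why short probes give false signals is to estimate how often a probe sequence would turn up somewhere in the genome purely by chance. The sketch below assumes a very crude model in which all four bases are equally likely and independent. The genome length of 3 billion bases is an assumed round figure, and the function name is made up for this example.

```python
def expected_chance_matches(probe_length: int, genome_length: int) -> float:
    """Expected number of exact chance matches of one probe in a random genome."""
    # Number of positions where a probe of this length could start.
    positions = max(genome_length - probe_length + 1, 0)
    # Each position matches with probability (1/4) ** probe_length.
    return positions / 4 ** probe_length


GENOME_LENGTH = 3_000_000_000  # assumed size, for illustration only

for k in (10, 15, 20, 25):
    print(k, expected_chance_matches(k, GENOME_LENGTH))
```

Under this model a 10-base probe is expected to match at thousands of unrelated positions, while a 25-base probe is very unlikely to match anywhere by chance.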
Bioinformatics problems
• 2. Which parts of a sequence should be
cloned in the probe?
DNA for beta hemoglobin
• ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTG
TGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGG
CAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTC
CTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAA
GGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGG
CCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAG
TGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAG
GCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGG
CAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGT
GGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA
Statistical issues in
microarrays
• 1. There is variability in how well each
probe in the microarray was made.
• 2. There is variability in how uniformly
the target got washed across the chip.
• 3. There is variability in how accurately
the probe binds with the target.
Statistical questions
• What level of expression is statistically
significant?
• If there are 20,000 probes, each tested at
the 5% significance level, how many are
expected to look significant by chance alone?
Statistical questions
• What level of expression is statistically
significant?
• If there are 20,000 probes, each tested at
the 5% significance level, about 1000
(20,000 × 0.05) are expected to look
significant by chance alone.
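The arithmetic behind the figure of 1000 is shown in the sketch below, together with one common remedy. The Bonferroni correction is not mentioned in the slides; it is included here only as an example of how the per-probe threshold could be tightened. The function names are invented for this illustration.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests that pass the threshold by chance alone."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the overall error rate at most alpha."""
    return alpha / n_tests


n_probes = 20_000
alpha = 0.05

print(f"{expected_false_positives(n_probes, alpha):.0f} chance events expected")
print(f"per-probe threshold: {bonferroni_threshold(n_probes, alpha):.1e}")
```

The corrected threshold is very strict, so it reduces false positives at the cost of more false negatives.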
Statistical issues
• How can the data be normalized (i.e.,
compared with known probability
distributions, like the normal curve)?
• P values: there will be false positives
and false negatives.
Experimental design issues
• Replication of biological samples
• Replication of RNA samples from each
experiment
• Replicate each spot on the microarray
Data warehousing
• The databases are huge and hence hard to
understand.