PHAR2811 Dale’s lecture 7
The Transcriptome
Synopsis: If protein-coding
portions of the human genome
make up only 1.5%, what is the
rest doing?
Definitions:
Genome: the total amount of genetic
material, stored as DNA.
• The nuclear genome refers to the DNA in
the chromosomes contained in the
nucleus; in the case of humans the DNA in
the 46 chromosomes. It is the nuclear
genome that defines a multicellular
organism; it will be the same for (almost)
all cells of the organism.
Genome:
• You can have organelle genomes such as
the mitochondrial genome.
• When you want to identify or distinguish
one organism from another, such as in
forensic testing, you investigate the
genome.
Transcriptome:
• The total amount of genetic information
which has been transcribed by the cell.
This information will be stored as RNA.
• This represents some 90% of the total
genomic sequences
• There is ~5X more RNA than DNA in a
cell, most of it rRNA (~80%) and tRNA
(~15%)
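The proportions above can be turned into a quick worked sum. This is only a sketch: it takes the cell's DNA as 1 arbitrary unit and uses the approximate slide figures, so the "other" class (everything that is not rRNA or tRNA, mRNA included) is simply what is left over.

```python
# Relative amounts, taking the cell's DNA as 1 unit (assumed for illustration).
DNA_UNITS = 1.0
RNA_TO_DNA_RATIO = 5      # "~5X more RNA than DNA"
RRNA_PERCENT = 80         # "~80%" of the RNA is rRNA
TRNA_PERCENT = 15         # "~15%" of the RNA is tRNA

def rna_breakdown(dna_units=DNA_UNITS):
    """Return RNA amounts (in DNA-equivalent units) by class."""
    total_rna = dna_units * RNA_TO_DNA_RATIO
    other_percent = 100 - RRNA_PERCENT - TRNA_PERCENT   # everything else, mRNA included
    return {
        "total": total_rna,
        "rRNA": total_rna * RRNA_PERCENT / 100,
        "tRNA": total_rna * TRNA_PERCENT / 100,
        "other": total_rna * other_percent / 100,
    }

breakdown = rna_breakdown()
for name, amount in breakdown.items():
    print(f"{name:>5}: {amount:.2f} units")
```

On these figures only about 5% of the RNA in a cell is left for all the other species.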
Transcriptome:
• The transcriptome is unique to a cell type
and is a measure of the gene expression.
• Different cells within an organism will have
different transcriptomes. Cell types can be
identified by their transcriptome.
Proteome:
• The cell’s complete protein output. This
reflects all the mRNA sequences
translated by the cell.
• Cell types have different proteomes and
these can be used to identify a particular
cell.
• Only 1 – 2% of the genome codes for the
proteome
Non-coding RNA
• Only 1-2 % of the genome codes
for proteins
• BUT a large amount of it is
transcribed; some estimates have
it as high as 98%.
How can the disparity
between the number of
sequences transcribed and
translated be explained?
Non-coding RNA
The difference is the RNA which is an
end in itself.
This non-coding RNA (ncRNA) consists
of:
– the introns of protein-coding genes,
– non-coding genes (what are these??),
– sequences antisense to or overlapping
protein-coding genes.
Non-coding RNA
• Ribosomal RNA (rRNA)
• Transfer RNA (tRNA)
• Small nuclear RNA (snRNA)
• Small nucleolar RNA (snoRNA)
• MicroRNA (miRNA)
• Short interfering RNA (siRNA)
RNA polymerases
• There are 3 RNA polymerases in
eukaryotes: RNA pol I, II & III
• RNA pol I transcribes rRNA, localised to
nucleolus (insensitive to alpha amanitin)
• RNA pol II transcribes mRNA (very sensitive to
alpha amanitin)
• RNA pol III transcribes tRNA and other
small RNAs (less sensitive to alpha amanitin)
RNA polymerases
• All three polymerases have >10 subunits;
500 – 700 kD BIG!!!
• Some of the subunits are unique to each
polymerase
• All have 2 large subunits (>140 kD) similar
in sequence to the β and β’ subunits of
bacterial RNA polymerase (the fundamental
catalytic site between the 2 faces is conserved throughout
life)
Let’s start with the most complex!
• RNA polymerase II which transcribes
mRNA.
• The primary transcript is a direct copy of
the gene.
• It includes the introns, 5’ and 3’UTRs but
NOT the promoter region
• This process is really complicated
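A toy illustration of what the primary transcript contains. Everything here is an assumption made for the sketch: the gene is laid out as named parts on the coding strand, the sequences are invented, and the promoter is placed directly next to the transcribed region for simplicity.

```python
# Toy gene laid out 5'->3' on the coding strand; every sequence is invented.
GENE_PARTS = [
    ("promoter", "TATAAAGGC"),
    ("5'UTR",    "GACT"),
    ("exon 1",   "ATGGCCAG"),
    ("intron 1", "GTAAGTTTCTCACTTTTCAG"),
    ("exon 2",   "GTTCTGA"),
    ("3'UTR",    "CCTTAA"),
]

def primary_transcript(parts):
    """Copy everything except the promoter into RNA (T becomes U)."""
    transcribed = [seq for name, seq in parts if name != "promoter"]
    return "".join(transcribed).replace("T", "U")

pre_mrna = primary_transcript(GENE_PARTS)
print(pre_mrna)
```

The intron is still present in `pre_mrna`; removing it is the job of splicing, not of the polymerase.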
RNA polymerase II abbreviations
• TATA box
• TBP: TATA binding protein
• TAFs: TBP associated factors
• TFII: transcription factor (RNA pol II); there
are A, B, D, E, F and H
• CTD: C terminal Domain (of RNA pol II)
RNA polymerase II
This is the basal transcription apparatus!!
[Figure: TFIID (TBP and TAFs) bound at the TATA box, with TFIIA, TFIIB, TFIIE, TFIIF, TFIIH and RNA polymerase II assembled on the DNA at the start site.]
RNA polymerase II
TFIIH is the only
transcription factor with
enzymic activity.
CTD of RNA pol II
• The C-terminal domain (CTD) is phosphorylated by
protein kinases; one is a subunit of TFIIH
• 2 subunits of TFIIH unwind the DNA
RNA polymerase II: elongation
[Figure: TFIID (TBP and TAFs) at the TATA box; RNA polymerase II, with TFIIF, on the DNA producing RNA.]
Gene Expression
[Figure labels: enhancer; transcriptional activator; transcriptional coactivators and corepressors; histone modification complex; chromatin remodelling complex; Mediator (acts on the basal machinery); TAFs; RNA pol II; TBP; nucleosomes.]
Other RNA polymerases
• The regulation of eukaryotic gene
expression is the subject of later lectures
• Let’s consider the other polymerases
Infrastructural RNA
• Ribosomal RNA in eukaryotes is actually
4 separate RNA species: 28S RNA, 18S
RNA, 5.8S RNA and 5S RNA.
• The 28S, 18S and 5.8S rRNA are
transcribed as a long precursor pre-rRNA
of 45S.
• The bacterial rRNAs (23S, 16S and 5S)
are also transcribed as one long molecule.
Processing pre-rRNA
• The 5.8S + 28S fragment is cleaved
from the 18S then the 5.8S species
is released, although it remains
hydrogen bonded to the 28S rRNA.
Processing pre-rRNA
• Initially the 45S pre-rRNA is
modified by 2’ O-ribose methylation
at many sites (humans have 106
sites) and some uridines are converted
to pseudouridines.
• This process is guided by snoRNAs
(we will meet them later).
Ribosomal RNA
• The rRNA is then modified by methylation
at some sites.
• There are many copies of the ribosomal
RNA sequences in the genome (as well as
of the histone genes).
• Some sequences are required by all cells
in such large quantities that they have
multiple copies in the genome.
Infrastructural RNA
• Transfer RNA is also transcribed as
a long precursor containing several
tRNAs joined together.
• Promoter lies within the coding
region
• RNase P releases the separate
tRNAs by cleavage at the 5’ end of
the tRNAs.
RNase P
• RNase P is an interesting enzyme
because it contains both RNA and
protein and it is the RNA component
that is capable of the RNase activity.
• It was this enzyme that led scientists to
the discovery of ribozymes; the RNA
species capable of catalytic activity.
Infrastructural RNA
• The 3’ ends of the tRNAs all have a CCA, some
of which are attached after cleavage (some
have the sequence encoded in the DNA). The
attachment is done by a special enzyme.
• The CCA is important as this is where the
amino acid is attached.
• Several of the bases in tRNA molecules are
modified at this stage, e.g. uridines to pseudouridines.
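The CCA rule can be sketched in a few lines. The two sequences are invented fragments, and the check is deliberately naive (it only looks at the last three bases), so treat it as an illustration of the rule rather than of the enzyme.

```python
def add_cca(trna):
    """Return the tRNA with a 3' CCA, adding it only if it is not already there."""
    if trna.endswith("CCA"):
        return trna          # CCA encoded in the DNA: nothing to add
    return trna + "CCA"      # CCA attached after cleavage

encoded = "GCGGAUUUAGCUCACCA"     # hypothetical fragment that already ends in CCA
unfinished = "GCGGAUUUAGCUCAG"    # hypothetical fragment lacking the CCA
print(add_cca(encoded))
print(add_cca(unfinished))
```

Either way the finished molecule ends in CCA, which is where the amino acid is attached.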
Other non-coding RNAs.
• Small nuclear RNAs (snRNAs) form
part of the spliceosome which cleaves
the introns out of mRNA precursors.
• There are 5 snRNAs; U1, U2, U4, U5
and you guessed it U6. I have no idea
what happened to U3???
Other non-coding RNAs.
• These RNA species are between 50 and
200 nucleotides long and complex with
proteins to form snRNPs (small nuclear
ribonucleoprotein particles..snurps).
• These small RNAs contribute to the
recognition of splice sites in the mRNA
and to catalysing the breaking and joining
of the mRNA.
Splicing
• Process where the introns are removed
from the pre-mRNA
• Occurs in the nucleus
• Capping (m7G at the 5’ end) and polyA
tailing at the 3’ end are carried out first
• Splice sites are defined by a sequence
• Formation of a “lariat” by the spliceosome
(U1, U2, U4, U5 & U6 and ~150 proteins)
Splicing
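[Figure: splice site consensus sequences.
5’ splice site: AG|GUAAGU (end of exon 1 | start of intron)
Branch site: YNYRAY
3’ splice site: YYYNCAG|G (end of intron | start of exon 2)
Y pyrimidine, R purine, N any nucleotide]
Lariat formed when the 5’ p of the intron G attaches to the 2’ OH of the branch site A.
Exon 1 (AG-OH) is then joined to exon 2 (AGpG).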
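One way to see what the consensus letters mean is to turn them into a pattern and search a sequence. This is a sketch only: the intron sequence is invented, the consensus strings are the ones given on the slide, and real splice site recognition is done by the snRNPs rather than by exact pattern matching.

```python
import re

# The three codes used on the slide, written for RNA.
CODES = {"Y": "[CU]", "R": "[AG]", "N": "[ACGU]"}

def consensus_to_regex(consensus):
    """Turn a consensus such as YNYRAY into a compiled regular expression."""
    return re.compile("".join(CODES.get(base, base) for base in consensus))

BRANCH_SITE = consensus_to_regex("YNYRAY")
ACCEPTOR = consensus_to_regex("YYYNCAGG")

# Invented intron (plus the first G of exon 2) for illustration.
intron = "GUAAGUGAGGACUCAACGAGAUUUCCAGG"
branch = BRANCH_SITE.search(intron)
acceptor = ACCEPTOR.search(intron)
print("branch site:", branch.group(), "at position", branch.start())
print("3' splice site:", acceptor.group(), "at position", acceptor.start())
```

Letters that are not in `CODES` (A, C, G, U) are passed through unchanged, so fixed positions such as the branch point A must match exactly.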
snoRNA
• snoRNA are small nucleolar RNAs between
60 and 300 nucleotides in length.
• RNA editing function
• They recognise their target sequence by
base pairing and then recruit specialised
proteins to perform nucleotide modifications
on the target RNAs:
– 2’ O-ribose methylation,
– base deaminations such as adenine to inosine
conversions
– addition of pseudouridines.
snoRNA
• These modifications are crucial to
ribosome biogenesis.
• snoRNAs are derived from introns.
• snoRNAs, in conjunction with snRNAs,
have been suggested as regulators of
alternative splice sites.
Alternative splicing
• A typical eukaryotic gene consists of
introns and exons.
• The introns are removed by the
spliceosome.
• The exons are joined in the same order
as they appear in the gene sequence.
• In about 60% of human genes certain
exons can be skipped.
Typical Human Gene
• Human genes typically contain around 10
exons (each on average about 300 bp
in length, with the final exon often being
considerably longer) separated by 9 introns
(which may vary from a few hundred bp
to many kilobases or 100s of kilobases in
length).
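The numbers above give a quick back-of-the-envelope sum. The function below is a sketch that uses only the slide's averages (10 exons of about 300 bp) and ignores the longer final exon.

```python
def gene_layout(n_exons=10, mean_exon_bp=300):
    """Return (number of introns, total exon length in bp) for a simple gene."""
    n_introns = n_exons - 1               # one intron between each pair of exons
    exonic_bp = n_exons * mean_exon_bp
    return n_introns, exonic_bp

n_introns, exonic_bp = gene_layout()
print(f"{n_introns} introns, about {exonic_bp} bp of exon sequence")
```

With introns running to many kilobases each, roughly 3 kb of exon is a small part of the transcribed gene.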
Alternative splicing
• This leads to alternative splicing.
• There are some genes with many
different potential exons and these genes
have the potential to form multiple
different mature mRNAs and proteins.
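To get a feel for how quickly the possibilities grow, the sketch below lists exon combinations for a hypothetical four-exon gene. It assumes the first and last exons are always kept and that every internal exon may be skipped independently. That is an upper bound made for illustration; real genes use far fewer combinations.

```python
from itertools import combinations

def possible_mrnas(exons):
    """List exon combinations that keep the first and last exon and the gene order."""
    first, *middle, last = exons
    isoforms = []
    for kept in range(len(middle), -1, -1):          # most exons retained first
        for chosen in combinations(middle, kept):    # combinations preserve order
            isoforms.append([first, *chosen, last])
    return isoforms

exons = ["E1", "E2", "E3", "E4"]      # hypothetical four-exon gene
isoforms = possible_mrnas(exons)
for isoform in isoforms:
    print("-".join(isoform))
```

Under these assumptions a gene with n internal exons has 2^n possible mature mRNAs, and the exons always stay in the same order as in the gene.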
Alternative splicing
[Figure: a pre-mRNA of exons and introns is processed by the spliceosome (made up of 5 snRNPs and ~150 proteins) into one of several alternative mature mRNAs, each containing a different combination of exons.]
snoRNA
• snoRNAs are derived from the introns of
pre-mRNA transcripts, suggesting that
introns are not “junk” DNA.
miRNA and siRNA
• microRNA (miRNA) and short
interfering RNA (siRNA) are very small
RNA molecules, between 21 and 25
nucleotides long.
• These are the hot molecules! They are
seen as the next anti-viral agents, cures
for cancer etc even a replacement for
fossil fuels!!!
miRNA and siRNA
• The 2 species are quite similar, the
variations come from their source or origin.
• MicroRNA comes from short endogenous
hairpin loop structures, synthesised by
RNA pol II, often from within introns.
• The hairpin structures are cleaved in the
nucleus, exported to the cytoplasm and
further processed to ~22 nt duplexes.
Pre-miRNA in the nucleus
[Figure: a transcript synthesised by RNA pol II (exon, intron) carries a hairpin; Drosha cuts it out, leaving a 65 – 75 nt stem loop structure ready for export to the cytoplasm.]
Pre-miRNA in the cytoplasm
[Figure: Dicer cuts the pre-miRNA, and the siRNA precursor, into 21 – 26 nt ds RNA, which is loaded into RISC. Outcomes: translational inhibition of partially complementary mRNA, or degradation of complementary mRNA.]
miRNA
• Drosha, a nuclear RNase III endonuclease, cuts
the hairpin out of the primary transcript and the 65 – 75 nt
pre-miRNAs are exported to the cytoplasm
by exportin 5
• They are further processed by another RNase
III endonuclease system, Dicer.
• The mature miRNAs are ~22 nt duplexes
and act usually to repress translation of
target mRNA sequences.
siRNA
• siRNAs are similar but are produced from
long double stranded RNA molecules or
giant hairpin molecules, often of
exogenous origin.
• This whole process is thought to be part of
the cell’s antiviral defense.
siRNA
• Researchers can also introduce their own
double stranded RNA.
• The double stranded molecules are
processed by Dicer, the cytoplasmic
RNase III endonuclease system.
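A very rough picture of "dicing" is to cut a long strand into consecutive short pieces. The sketch below makes several simplifying assumptions: it handles a single strand rather than a duplex, uses a fixed piece size of 22 nt, ignores the overhangs on real Dicer products, and works on an invented sequence.

```python
def dice(long_rna, size=22):
    """Cut a long RNA strand into consecutive pieces of `size` nucleotides."""
    pieces = [long_rna[i:i + size] for i in range(0, len(long_rna), size)]
    return [p for p in pieces if len(p) == size]     # drop a short leftover end

long_strand = "AUGGCUAGCUAGGCUAUCGAUCGG" * 3    # invented 72 nt strand
pieces = dice(long_strand)
for piece in pieces:
    print(len(piece), piece)
```

One long double stranded molecule can therefore give rise to many different short interfering RNAs.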
siRNA
• The processed short interfering RNA can
direct the destruction of endogenous
mRNAs of the same sequence. This
process, RNA interference (RNAi), has been
used very successfully by scientists to
silence genes or knock them down.
How do miRNA and siRNA
regulate gene expression?
• Translation repression of target sequences
• mRNA destruction of target sequences
• Silencing chromatin
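The first two outcomes depend on how well the small RNA pairs with the mRNA, which can be sketched as a simple rule. The assumptions are loud ones: all sequences are invented, full complementarity is taken to mean destruction, and partial complementarity is approximated by a match to nucleotides 2 to 8 of the small RNA (often called the seed). The seed rule is a common simplification and is not taken from the slides. Chromatin silencing is not modelled.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the strand that would base pair with `rna`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))

def predicted_effect(small_rna, mrna, seed_length=7):
    """Guess the outcome for an mRNA from how well the small RNA pairs with it."""
    if reverse_complement(small_rna) in mrna:
        return "mRNA destruction"                 # fully complementary site
    seed_site = reverse_complement(small_rna[1:1 + seed_length])
    if seed_site in mrna:
        return "translational repression"         # only partly complementary
    return "no effect"

small_rna = "ACGUUGCAUCGGAUACCUGAUU"              # invented 22 nt small RNA
perfect_target = "GGCC" + reverse_complement(small_rna) + "AAAA"
partial_target = "GGCCCCCC" + reverse_complement(small_rna[1:8]) + "AAAA"
unrelated = "GGGGCCCCGGGG"

for name, mrna in [("perfect", perfect_target),
                   ("partial", partial_target),
                   ("unrelated", unrelated)]:
    print(name, "->", predicted_effect(small_rna, mrna))
```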
Translational Repression
[Figure: an mRNA (5’UTR, 3’UTR, poly(A) tail) with labels for the RNA, the recruited proteins and a protein that binds to the 5’UTR.]
mRNA destruction: sequence-specific
targeting by siRNA and miRNA
[Figure: an mRNA (5’UTR, 3’UTR, poly(A) tail); the RNA targets the sequence for destruction.]
Pharmaceutical Applications
• Use of modified anti-miRNA
oligonucleotides (AMOs)
• Complementary to miRNA
• Inhibit a particular miRNA activity
• Example is inhibition of miR-122
• Cholesterol-conjugated AMO injected
intraperitoneally (twice weekly)
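"Complementary to miRNA" means the AMO pairs with every base of its target. The sketch below derives such a sequence and checks the pairing. The miRNA sequence is a placeholder and is not the real miR-122 sequence, and the chemical modifications and cholesterol conjugate of a real AMO are not represented.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def design_amo(mirna):
    """Return the antisense sequence (5'->3') that pairs with every base of `mirna`."""
    return "".join(COMPLEMENT[base] for base in reversed(mirna))

def pairs_fully(mirna, oligo):
    """True if the two strands are the same length and pair at every position."""
    return len(mirna) == len(oligo) and all(
        COMPLEMENT[m] == o for m, o in zip(mirna, reversed(oligo))
    )

mirna = "UGGCAUACGGAUCCUAAGCUAG"     # placeholder, NOT the real miR-122 sequence
amo = design_amo(mirna)
print(amo, pairs_fully(mirna, amo))
```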
Pharmaceutical Applications
• miR-122 is a liver specific miRNA
• Its target gene mRNAs are
sequences involved in cholesterol
regulation
• Increasing the level of the target
mRNAs lowers cholesterol
Pharmaceutical Applications
• The AMO lowered miR-122 levels,
which increased the target mRNA
levels
• This resulted in significantly
reduced plasma cholesterol levels
after 4 weeks
AMO to miR-122
[Figure: miR-122 inhibits translation of target mRNAs involved in cholesterol regulation in liver. Introducing the AMO, a stabilised oligonucleotide complementary to miR-122, given intraperitoneally twice weekly, inactivates miR-122. miR-122 target mRNAs increase → lower plasma cholesterol.]