Proximal promoter

Inferring Transcriptional
Regulation Using
Transcriptomics
Carsten O. Daub
September 1st, 2014
StratCan Summer School 2014
Vår Gård, Saltsjöbaden
Overview –
Levels of Regulation
• Genome
– SNP
– DNA modifications (e.g. methylation)
– structural alterations (e.g. genomic rearrangements)
• Transcriptome
– Transcription factors, enhancers/insulators
– Promoter
– RNA splicing
– miRNA
– Posttranscriptional modifications (e.g. RNA editing)
– 3D structure of the genome
• Protein
– Translation
– Posttranslational modifications
• Metabolites
Central Dogma of Molecular
Biology
DNA
Transcription
RNA
Translation
Protein
Francis Crick, 1958
Non-coding RNA
What is the transcriptome?
• The ensemble of all expressed RNA
• Protein coding genes
• Non-protein coding genes
How is the Transcriptome
regulated?
• Via Promoter
– Transcription factors
– enhancers
– insulators
• RNA splicing
• miRNA
• Posttranscriptional modifications (e.g.
RNA editing)
• 3D structure of the genome
Regulation via the Promoter
Transcription
• The principle:
DNA is copied into RNA by the RNA
polymerase (Pol)
5’
Pol
3’
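A toy illustration of the principle above, assuming the template strand is given in 3'→5' orientation. The function name and the example sequence are made up for illustration; real transcription also involves initiation, processing and termination, which are not modelled here.

```python
# Toy model of transcription: RNA polymerase reads the template strand
# 3'->5' and synthesises the complementary RNA 5'->3'.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') copied from a DNA template given 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5.upper())


# Hypothetical template sequence, for illustration only.
rna = transcribe("TACGGCATT")
print(rna)
```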
• Transcription initiation is more complex in eukaryotes
than in prokaryotes
• In eukaryotes several different factors are necessary for
the transcription of an RNA polymerase II promoter.
http://en.wikipedia.org/wiki/Gene
• Initiation
– Promoter
clearance
– Pol2 stalling
• Elongation
• Termination
Figures from http://en.wikipedia.org/wiki/Transcription_(genetics)
Transcription Model
5’
Pol
3’
Transcription
Pre-mRNA
(precursor)
Capping
Splicing
Polyadenylation
mRNA
AAAAAAAAAAA
Transcription Factor (TF) Binding
• TFs bind to
specific sites in
the DNA
• Sets of TFs can
function as cis-regulatory
modules (CRM)
Nature Reviews Genetics 5, 276-287 (April 2004)
Specific TF Binding
• Transcription factors bind to specific DNA
sequences
• Databases of TF binding sequence motifs
– JASPAR, TRANSFAC
IRF8 binding motif
DNA
IRF8
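A minimal sketch of how a binding motif from such a database could be used to scan a sequence. The count matrix below is hypothetical (it is NOT the real IRF8 motif from JASPAR or TRANSFAC), and the uniform background, the threshold and the function names are assumptions made for illustration.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif with consensus GAAA.
# These counts are invented; they are not taken from JASPAR or TRANSFAC.
PFM = {
    "A": [1, 8, 8, 8],
    "C": [1, 1, 1, 1],
    "G": [8, 1, 1, 1],
    "T": [1, 1, 1, 1],
}


def log_odds_matrix(pfm, background=0.25):
    """Convert counts to log2 odds against a uniform background (assumed)."""
    width = len(pfm["A"])
    matrix = []
    for i in range(width):
        total = sum(pfm[base][i] for base in "ACGT")
        matrix.append(
            {base: math.log2((pfm[base][i] / total) / background) for base in "ACGT"}
        )
    return matrix


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


pwm = log_odds_matrix(PFM)
hits = scan("TTGAAACCGAAAGT", pwm, threshold=4.0)
print(hits)
```

Only the forward strand is scanned here; a real scan would usually also score the reverse complement.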
Promoter Region
Transcription start site (TSS)
Distal promoter
[-10k, -250]
Proximal promoter
[-250, -34]
Core promoter
[-34, -1]
Promoter Region
• Core promoter –
the minimal portion of the promoter required to properly initiate
transcription
– Transcription Start Site (TSS)
– Approximately -34
– A binding site for RNA polymerase
– General transcription factor binding sites
• Proximal promoter –
the proximal sequence upstream of the gene that tends to contain
primary regulatory elements
– Approximately -250
– Specific transcription factor binding sites
• Distal promoter –
the distal sequence upstream of the gene that may contain
additional regulatory elements, often with a weaker influence than
the proximal promoter
– Anything further upstream (but not an enhancer or other regulatory
region whose influence is positional/orientation independent)
– Specific transcription factor binding sites
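The approximate coordinates above can be sketched as a small lookup. The function name is hypothetical, the intervals are the rough values from the slide ([-10k, -250], [-250, -34], [-34, -1]), and assigning each shared boundary to the region closer to the TSS is an assumption.

```python
def promoter_region(position: int) -> str:
    """Classify a position relative to the TSS (negative = upstream).

    Interval limits are the approximate values from the slide; shared
    boundaries (-34, -250) are assigned to the region nearer the TSS,
    which is an arbitrary choice made for this sketch.
    """
    if -34 <= position <= -1:
        return "core promoter"
    if -250 <= position < -34:
        return "proximal promoter"
    if -10_000 <= position < -250:
        return "distal promoter"
    return "outside promoter"


for pos in (-20, -100, -5000, 50):
    print(pos, promoter_region(pos))
```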
Transcription in eukaryotes
• In eukaryotes, several different factors are necessary for
the transcription of an RNA polymerase II promoter.
Name                 Location    RNA transcribed
RNA Polymerase I     nucleolus   ribosomal RNA (rRNA)
RNA Polymerase II    nucleus     messenger RNA (mRNA) and most small nuclear RNAs (snRNAs)
RNA Polymerase III   nucleus     transfer RNA (tRNA) and other small RNAs
Identifying the TF regulators
• How much is a TF binding site used
– Observed expression of all genes
– Predicted site count
• Motif Activity Response Analysis (MARA)
FANTOM4 – A Systems Approach
Monoblast-like THP-1 cells were stimulated by PMA to differentiate them into monocyte-like cells.
10 time point samples were collected during differentiation.
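[Figure: THP-1 time course from monoblast-like to monocyte-like cells after PMA stimulation]
Time points: 0, 1, 2, 4, 6, 12, 24, 48, 72, 96 hours
Replicates: RIKEN1, RIKEN3, RIKEN5, RIKEN6
Figure labels: Microarray check; Not good; Deep CAGE; TF qRT-PCR; Illumina (47K probes), 10 time points; miRNA microarray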
Cap Analysis of Gene Expression (CAGE)
CAGE library preparation
CAGE data digital processing
Sequencing
Figure based on [1]
Tag cluster (TC)
[1] Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genetics 38, 626–35 (2006)
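A minimal sketch of how mapped CAGE tags might be grouped into tag clusters (TC). It assumes the tags are 5' end positions on one chromosome and one strand, and it merges tags that lie within a fixed distance of each other. The function name and the 20 bp gap are assumptions for illustration, not the clustering rule used in [1].

```python
from collections import Counter


def tag_clusters(tag_starts, max_gap=20):
    """Group CAGE tag 5' positions into clusters separated by > max_gap bp."""
    clusters = []
    for pos in sorted(tag_starts):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # extend the current cluster
        else:
            clusters.append([pos])     # start a new cluster
    summary = []
    for members in clusters:
        counts = Counter(members)
        # Dominant TSS: position with the most tags (ties -> most upstream).
        dominant = max(counts, key=lambda p: (counts[p], -p))
        summary.append(
            {"start": members[0], "end": members[-1],
             "tags": len(members), "dominant_tss": dominant}
        )
    return summary


# Hypothetical tag positions, for illustration only.
tags = [100, 101, 101, 103, 101, 250, 252, 252, 900]
clusters = tag_clusters(tags)
for tc in clusters:
    print(tc)
```

The tag count per cluster is the quantity read as promoter expression; the dominant position gives a single representative TSS.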
CAGE identifies the active set of promoters
Alternative promoter
usage for PTPN6
HeLa Promoter
THP-1 Promoter
Slide modified from Alistair Forrest.
Kanamori-Katayama, Itoh, Kawaji et al. 2011 Genome Research.
“Unamplified cap analysis of gene expression on a single-molecule sequencer”
Transcriptional Regulation
A. TFBS prediction
B. Co-expression
[Figure: number of CAGE tags in each promoter (TF A and the CAGE promoters of Gene B, Gene C and Gene D) over the time course from 0h to 96h; × marks the average expression]
TFBS prediction and co-expression = total score
TF A → promoter B: High, High
TF A → promoter C: High, High
TF A → promoter D: Low, Low
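A minimal sketch of combining the two lines of evidence for TF A and promoters B, C and D. The slide does not say how the TFBS prediction and the co-expression are combined, so multiplying a TFBS score by the non-negative Pearson correlation is an assumption, and all numbers below are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical CAGE tag counts over five time points.
tf_a = [1, 2, 4, 8, 16]
promoters = {
    "B": {"tfbs": 0.9, "expr": [2, 4, 8, 16, 32]},
    "C": {"tfbs": 0.8, "expr": [1, 3, 4, 9, 15]},
    "D": {"tfbs": 0.1, "expr": [9, 7, 8, 7, 9]},
}

# Assumed combination rule: TFBS score x positive part of the correlation.
scores = {
    name: p["tfbs"] * max(pearson(tf_a, p["expr"]), 0.0)
    for name, p in promoters.items()
}
for name, score in scores.items():
    print(f"TF A -> promoter {name}: total score {score:.2f}")
```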
Motif Activity Response Analysis –
MARA
[Figure: genome with predicted motif sites per promoter: Promoter1 (m1 m1 m1 m2 m3), Promoter2, ..., PromoterX; further sites m1, m1, m4, m5]
e_{ps} = \sum_{m} R_{pm} A_{ms}
Expression: e_{ps}
Reaction efficiency: R_{pm}
• Number of possible binding sites
• Degree of conservation of the motif
Effective concentration: A_{ms}
• Chromatin status
THP-1 cells are a monoblastic leukemia cell line which upon PMA treatment can differentiate into an adherent monocyte-like cell (CD14+, CSF1R+)
Suzuki, Forrest, van Nimwegen et al. Nature Genetics 2009, 41:5
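A minimal sketch of the idea behind MARA: expression of promoter p in sample s is modelled as the sum over motifs m of site counts R[p][m] times motif activities A[m][s], and the activities are estimated by least squares. This is an illustration under simplifying assumptions (pure Python, normal equations with an optional ridge term, no centring or noise model) and not the procedure of Suzuki et al.; all function names and numbers are hypothetical.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b for square a by Gauss-Jordan with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def motif_activities(R, E, ridge=0.0):
    """Estimate A in E ~ R @ A via (R^T R + ridge*I) A = R^T E."""
    Rt = transpose(R)
    gram = matmul(Rt, R)
    for i in range(len(gram)):
        gram[i][i] += ridge
    return solve(gram, matmul(Rt, E))


# Hypothetical site counts: 4 promoters x 2 motifs.
R = [[3, 0], [1, 1], [0, 2], [2, 1]]
# Hypothetical true activities: 2 motifs x 3 samples (time points).
A_true = [[1.0, 0.5, -1.0], [-0.5, 0.0, 2.0]]
E = matmul(R, A_true)          # simulated, noise-free expression
A_est = motif_activities(R, E)
print(A_est)
```

With noise-free simulated data the estimated activities match the ones used to generate the expression; with real data a positive `ridge` value would regularise the fit.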
Motif Activity Response
Analysis
• How much is a
binding site used
– Observed
expression of all
promoters over
time
– Predicted site
count
Suzuki, Forrest, van Nimwegen et al. Nat Genet. 2009 May;41(5):553-62.
Enhancers
• Enhancers are sequence motifs
• They bind factors (proteins) that
participate in the transcription initiation
complex
• Enhancers can be many kb away from the
TSS
• Insulators also act at a distance, but limit
expression (e.g. by blocking enhancer-promoter interactions)
• Is an enhancer a gene?
Enhancer RNA
• ENCODE reported (Nature, 489(7414), 101–108)
– Enhancers identified by co-occurrence of
H3K27ac and H3K4me1 ChIP-Seq data,
centred on P300 binding sites, in HeLa cells
• Enhancers make non-coding RNA
Nature 465, 173–174 (2010).
• Widespread transcription at neuronal
activity-regulated enhancers.
(Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010).)
Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., et
al. (2012). Landscape of transcription in human cells. Nature, 489(7414), 101–
108. doi:10.1038/nature11233
RNA splicing in cancer
http://en.wikipedia.org/wiki/RNA_splicing
Example: Melanoma Transcriptome
• discovery of aberrations that contribute to
carcinogenesis
• characterize the spectrum of cancer-associated mRNA alterations through
integration of transcriptomic and structural
genomic data
– 11 novel melanoma gene fusions produced by
underlying genomic rearrangements
– 12 novel readthrough transcripts
Genome Res. 2010 Apr;20(4):413-27
Melanoma Transcriptome:
Gene Fusion
Connecting genes located on
different chromosomes!
Melanoma Transcriptome:
Gene Read-through
• Gene fusions are ‘private’
– The same gene fusion was not observed in
two melanoma patients (10 samples total)
• Gene fusions in melanoma might not be
the cancer-causing events but
consequences
Chromosome Structure
Ref: http://www.sequentiabiotech.com/
http://en.wikipedia.org/wiki/Chromosome_conformation_capture
Mouse ES cells
Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., et al. (2012). Topological domains in mammalian
genomes identified by analysis of chromatin interactions. Nature. doi:10.1038/nature11082
• Remote ER-α chromatin binding sites are
anchored at gene promoters through long-range
chromatin interactions
• This suggests that ER-α functions by extensive
chromatin looping to bring genes together for
coordinated transcriptional regulation
Nature. 2009 Nov 5;462(7269):58-64
Polymerase II Stalling
stalled
active
No binding
Nature Genetics 39, 1512 - 1516 (2007)
• Pol II ChIP-chip in Drosophila embryos
• Stalled genes are highly enriched in developmental control genes
Transcriptional Regulation in
Cancer
From observations to
mechanisms
• Observations => Biomarkers