Olga_project

Initiation of Transcription:
(see also: http://en.wikibooks.org/wiki/An_Introduction_to_Molecular_Biology/Transcription_of_RNA_and_its_modification)
The TATA box is the binding site for a transcription factor known as TATA-binding protein
(TBP), which is itself a subunit of another transcription factor, called Transcription Factor II D
(TFIID). After TFIID binds to the TATA box via the TBP, five more transcription factors and
RNA polymerase combine around the TATA box in a series of stages to form a preinitiation
complex. One of these transcription factors (TFIIH) has helicase activity and so helps
separate the opposing strands of double-stranded DNA, giving access to a single-stranded
DNA template. However, the preinitiation complex alone drives only a low, or basal, rate
of transcription. Other proteins known as activators and repressors, along with any associated
coactivators or corepressors, are responsible for modulating the transcription rate.
Thus, the preinitiation complex contains:
1. Core Promoter Sequence
2. Transcription Factors
3. DNA Helicase
4. RNA Polymerase
5. Activators and Repressors
From: nsmb0907-788.pdf
Modeling Progress of RNA Polymerase along DNA
Pseudocode
1. User can specify
a. Number of initiation complex factors
b. Total length of nascent chain.
c. Residence time between jumps
d. What about:
i. Possibility of early termination?
ii.
2. Code should allow for stochasticity in ‘connection’ for all of the Transcription Factors
3. Set it up so each jump is modeled independently
4.
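The user-specified quantities listed above (chain length, residence time between jumps, possible early termination) can be sketched as a small simulation. This is a minimal Python sketch, not the project's Matlab code. It assumes exponentially distributed residence times and a fixed per-jump termination probability, neither of which is stated in these notes. The names `simulate_elongation`, `mean_residence_time` and `p_terminate` are hypothetical.

```python
import random


def simulate_elongation(chain_length, mean_residence_time, p_terminate=0.0, rng=None):
    """Advance a polymerase one position per jump until the chain is complete.

    Each jump is modeled independently. The residence time before a jump is
    drawn from an exponential distribution (an assumption), and with
    probability p_terminate the polymerase falls off instead of jumping.
    Returns (position_reached, elapsed_time, completed).
    """
    if rng is None:
        rng = random.Random()
    position, elapsed = 0, 0.0
    while position < chain_length:
        # Waiting time at the current position before the next event.
        elapsed += rng.expovariate(1.0 / mean_residence_time)
        if rng.random() < p_terminate:
            return position, elapsed, False  # early termination
        position += 1
    return position, elapsed, True


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [simulate_elongation(100, 0.5, 0.001, rng) for _ in range(1000)]
    completed = sum(1 for run in runs if run[2])
    print(f"{completed} of {len(runs)} runs reached full length")
```

Passing a seeded `random.Random` makes a run reproducible, which helps when comparing model output against a published figure.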
The basic unit:
[Figure. Labels: Multiple Transcription Factors; Transcription Factor I(?) binds to TATA box; Transcription Initiation Complex; RNA Polymerase II]
Notes, thoughts, etc.
1. Darzacq et al. (2007) is the paper that her “crazy” ideas stem from. Regarding Fig. 3:
a.
b. The only way I can replicate their Fig. 3 (the output of the model) is to model
the temporal dynamics of the Promoter using the first term of the statistical
fit presented in Fig. 1. Then, adding that to the output of the model for
Initiating and Engaged fits the data well.
2. aicbic - Akaike and Bayesian information criteria
Syntax
AIC = aicbic(LLF,NumParams)
[AIC,BIC] = aicbic(LLF,NumParams,NumObs)
3. In a Poisson process,
$$P\{N(t+\tau) - N(t) = k\} = \frac{e^{-\lambda\tau}(\lambda\tau)^{k}}{k!}.$$
Because the process is characterized by orderliness, i.e.,
$$\lim_{\Delta t \to 0} P[\,N(t+\Delta t) - N(t) > 1 \mid N(t+\Delta t) - N(t) \ge 1\,] = 0,$$
and memorylessness, the waiting times between occurrences follow a negative exponential distribution.
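Proof: Let $\tau_1$ be the first arrival time of the Poisson process. Its distribution
satisfies
$$\begin{aligned}
\Pr[\tau_1 = x] &= \lim_{dt \to 0} \frac{\Pr[N_{x+dt} > 0,\ N_x = 0]}{dt} \\
&= \lim_{dt \to 0} \frac{1 - \Pr[N_{dt} = 0]}{dt}\,\Pr[N_x = 0] \\
&= \lim_{dt \to 0} \frac{1 - [1 - \lambda\,dt + O(dt^{2})]}{dt}\,e^{-\lambda x} \\
&= \lambda e^{-\lambda x}
\end{aligned}$$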
4. Good papers:
a. AndKurJuly10.pdf, SURVEY_AndKurtz.pdf (they’re similar, maybe the same
paper?)
b. Discrete Time Markov Chains.pdf
c.
5. ***
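The claim in note 3, that exponential waiting times go together with Poisson-distributed counts, can be checked numerically. This is a minimal Python sketch. The rate, window length and trial count are arbitrary illustration values, and `count_events` and `poisson_pmf` are hypothetical helper names.

```python
import math
import random


def count_events(rate, window, rng):
    """Count events in [0, window) by summing exponential waiting times."""
    t, n = rng.expovariate(rate), 0
    while t < window:
        n += 1
        t += rng.expovariate(rate)
    return n


def poisson_pmf(k, rate, window):
    """P[N(t + tau) - N(t) = k] for a Poisson process, with tau = window."""
    mean = rate * window
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    rng = random.Random(0)
    rate, window, trials = 2.0, 1.5, 20000
    counts = [count_events(rate, window, rng) for _ in range(trials)]
    # Print k, the simulated frequency, and the Poisson probability side by side.
    for k in range(6):
        observed = counts.count(k) / trials
        print(k, round(observed, 3), round(poisson_pmf(k, rate, window), 3))
```

If the two printed columns agree to within sampling noise, the simulated counts are consistent with the Poisson formula.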
Relevant Links
1. http://en.wikibooks.org/wiki/An_Introduction_to_Molecular_Biology/Transcription_of_RNA_and_its_modification
2. http://en.wikipedia.org/wiki/Rate_equation
3. http://www.web-books.com/MoBio/Free/Chap4.htm
4. http://nimbios.org/tutorials/TT_stochastic_modeling
5. http://en.wikipedia.org/wiki/Gillespie_algorithm
6. http://web.mit.edu/biophysics/papers.html
7. http://www.youtube.com/watch?v=Fa4skYBJHoI
8. http://www.math.cas.cz/~vejchod/gillespiessa/gillespiessa.html - download the Gillespie
SSA Matlab package (gillespiessa_rel1.tar.gz).
9.
Matlab Programs
1. tattaiR.m ***
2.
***
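$$\frac{\mathrm{sum}(y)}{\mathrm{sum}(y) + x} = Q \;\rightarrow\; \mathrm{sum}(y) = Q\,\mathrm{sum}(y) + Qx \;\rightarrow\; \mathrm{sum}(y)\,(1 - Q) = Qx \;\rightarrow\; x = \frac{(1 - Q)\,\mathrm{sum}(y)}{Q}$$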
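The rearrangement above solves sum(y) / (sum(y) + x) = Q for x. A minimal Python sketch of that arithmetic follows. The notes do not define y, Q or x, so treating y as a list of numbers and Q as a fraction in (0, 1] is an assumption, and `solve_x` is a hypothetical name.

```python
def solve_x(y, q):
    """Return x such that sum(y) / (sum(y) + x) == q, for 0 < q <= 1."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    total = sum(y)
    return total * (1 - q) / q


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    x = solve_x(y, 0.75)
    # Substituting x back in should recover the target fraction.
    print(x, sum(y) / (sum(y) + x))
```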
Pseudocode
1. TFIID binds to the TATA box via the TBP
2. At least 5 other factors bind to TFIID – in a series of stages – to produce preinitiation
complex (PIC)
a. One of the TFs has DNA helicase activity.
b. Activators/co-activators and repressors/co-repressors are also required.
→ Need to treat it as a Markov chain? I.e.,
1. If TBP is bound to DNA, then TFIID can bind to TATA box.
2. If TFIID is bound to TATA box, then TF#2 can bind.
3. If TF#2 is bound to TFIID, then TF#3 can bind.
4. If TF#3 is bound to TFIID, then TF#4 can bind.
5. If TF#4 is bound to TFIID, then TF#5 can bind.
6. If TF#5 is bound to TFIID, then TF#6 can bind.
7. If TF#6 is bound to TFIID, then transcription can start.
Could set up a vector to hold the state (0 or 1) for the binding of each of the 7 factors. A
switch statement could then test the state, e.g.
case V_test == [1 1 0 0 0 0 0]
If the test returns 1 (i.e., TRUE, meaning TBP and TFIID are both bound), then enter a
code snippet that computes test_var = rand() and compares it with (1 − k_n). If
test_var > (1 − k_n), then the next TF in the sequence can bind to the ‘growing’
complex. Otherwise, enter a code snippet that allows the last-to-bind TF to dissociate (or
not).
k = 0.1 → 10% chance of binding during each time step
rand() output: a uniform random variate on (0, 1); 10% of draws fall below 0.1 and 10% fall above 0.90

3. ***
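The state-vector idea in step 2 can be sketched as follows. This is a minimal Python sketch rather than Matlab. Because factors bind strictly in order, it counts the bound factors instead of using a switch statement over every possible vector. The function name `assemble_pic`, the single shared binding probability `k_bind`, and the dissociation probability `k_off` are assumptions made for illustration.

```python
import random


def assemble_pic(n_factors=7, k_bind=0.1, k_off=0.0, max_steps=100000, rng=None):
    """Simulate ordered assembly of the preinitiation complex.

    state[i] is 1 when factor i is bound. Factor i can bind only if factor
    i - 1 is already bound, and only the most recently bound factor can
    dissociate. Returns the number of time steps until every factor is
    bound, or None if max_steps is reached first.
    """
    if rng is None:
        rng = random.Random()
    state = [0] * n_factors
    for step in range(1, max_steps + 1):
        n_bound = sum(state)  # also the index of the next factor to bind
        if rng.random() > 1 - k_bind:
            state[n_bound] = 1  # next factor in the sequence binds
        elif n_bound > 0 and rng.random() < k_off:
            state[n_bound - 1] = 0  # last-to-bind factor dissociates
        if sum(state) == n_factors:
            return step  # complex complete, transcription can start
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    times = [assemble_pic(rng=rng) for _ in range(1000)]
    print("mean steps to full assembly:", sum(times) / len(times))
```

With `k_off = 0`, each binding step is a geometric waiting time, so the expected number of steps to full assembly is `n_factors / k_bind`, which is 70 for the defaults.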