Initiation of Transcription
(see also: http://en.wikibooks.org/wiki/An_Introduction_to_Molecular_Biology/Transcription_of_RNA_and_its_modification)

The TATA box is the binding site for a transcription factor known as TATA-binding protein (TBP), which is itself a subunit of another transcription factor, Transcription Factor II D (TFIID). After TFIID binds to the TATA box via TBP, five more transcription factors and RNA polymerase assemble around the TATA box in a series of stages to form a preinitiation complex. One transcription factor (TFIIH) has DNA helicase activity and so is involved in separating the opposing strands of double-stranded DNA to provide access to a single-stranded DNA template. However, the preinitiation complex alone drives only a low, or basal, rate of transcription. Other proteins known as activators and repressors, along with any associated coactivators or corepressors, are responsible for modulating the transcription rate.

Thus, the preinitiation complex contains:
1. Core Promoter Sequence
2. Transcription Factors
3. DNA Helicase
4. RNA Polymerase
5. Activators and Repressors

From: nsmb0907-788.pdf

Modeling Progress of RNA Polymerase along DNA

Pseudocode
1. User can specify
   a. Number of initiation complex factors
   b. Total length of nascent chain
   c. Residence time between jumps
   d. What about
      i. Possibility of early termination
      ii.
2. Code should allow for stochasticity in ‘connection’ for all of the Transcription Factors
3. Set it up so each jump is modeled independently
4. The basic unit:
   [Diagram labels: Multiple Transcription Factors; Transcription Factor I(?) binds to TATA box; Transcription Initiation Complex; RNA Polymerase II]

Notes, thoughts, etc.
1. Darzacq et al. (2007) is the paper that her “crazy” ideas stem from. Regarding Fig. 3:
   a.
   b. The only way I can replicate their Fig. 3 (the output of the model) is to model the temporal dynamics of the Promoter using the first term of the statistical fit presented in Fig. 1.
      Then, adding that to the output of the model for Initiating and Engaged fits the data well.
2. aicbic - Akaike and Bayesian information criteria
   Syntax
      AIC = aicbic(LLF,NumParams)
      [AIC,BIC] = aicbic(LLF,NumParams,NumObs)
3. In a Poisson process,

   $$P\{N(t+\tau) - N(t) = k\} = \frac{e^{-\lambda\tau}(\lambda\tau)^k}{k!}$$

   Because the process is characterized by orderliness, i.e.,

   $$\lim_{\Delta t \to 0} P[N(t+\Delta t) - N(t) > 1 \mid N(t+\Delta t) - N(t) \ge 1] = 0$$

   and by memorylessness, the waiting times between occurrences follow a negative exponential distribution.

   Proof: Let $\tau_1$ be the first arrival time of the Poisson process. Its density, written $\Pr[\tau_1 = x]$ here, satisfies

   $$\Pr[\tau_1 = x] = \lim_{dt \to 0} \frac{\Pr[N_{x+dt} > 0,\ N_x = 0]}{dt}
   = \lim_{dt \to 0} \Pr[N_x = 0]\,\frac{1 - \Pr[N_{dt} = 0]}{dt}
   = \lim_{dt \to 0} e^{-\lambda x}\,\frac{1 - [1 - \lambda\,dt + O(dt^2)]}{dt}
   = \lambda e^{-\lambda x}$$

4. Good papers:
   a. AndKurJuly10.pdf, SURVEY_AndKurtz.pdf (they’re similar, maybe the same paper?)
   b. Discrete Time Markov Chains.pdf
   c.
5. ***

Relevant Links
1. http://en.wikibooks.org/wiki/An_Introduction_to_Molecular_Biology/Transcription_of_RNA_and_its_modification
2. http://en.wikipedia.org/wiki/Rate_equation
3. http://www.web-books.com/MoBio/Free/Chap4.htm
4. http://nimbios.org/tutorials/TT_stochastic_modeling
5. http://en.wikipedia.org/wiki/Gillespie_algorithm
6. http://web.mit.edu/biophysics/papers.html
7. http://www.youtube.com/watch?v=Fa4skYBJHoI
8. http://www.math.cas.cz/~vejchod/gillespiessa/gillespiessa.html - download Gillespie SSA Matlab package (gillespiessa_rel1.tar.gz).
9.

Matlab Programs
1. tattaiR.m ***
2. ***

$$\frac{\mathrm{sum}(y)}{\mathrm{sum}(y) + x} = Q \;\rightarrow\; \mathrm{sum}(y) = Q\,\mathrm{sum}(y) + Qx \;\rightarrow\; \mathrm{sum}(y)(1 - Q) = Qx \;\rightarrow\; x = \mathrm{sum}(y)\,\frac{1 - Q}{Q}$$

Pseudocode
1. TFIID binds to the TATA box via the TBP
2. At least 5 other factors bind to TFIID, in a series of stages, to produce the preinitiation complex (PIC)
   a. One of the TFs has DNA helicase activity.
   b. Activators/co-activators and repressors/co-repressors are also required

Need to treat it as a Markov chain? I.e.,
1. If TBP is bound to DNA, then TFIID can bind to the TATA box.
2. If TFIID is bound to the TATA box, then TF#2 can bind.
3. If TF#2 is bound to TFIID, then TF#3 can bind.
4. If TF#3 is bound to TFIID, then TF#4 can bind.
5. If TF#4 is bound to TFIID, then TF#5 can bind.
6. If TF#5 is bound to TFIID, then TF#6 can bind.
7. If TF#6 is bound to TFIID, then transcription can start.

Could set up a vector to hold the state (0 or 1) for the binding of each of the 7 factors. Then a switch statement could test the state, e.g.

   case V_test == [1 1 0 0 0 0 0]

If the test returns 1 (i.e., TRUE, meaning TBP and TFIID are both bound), then enter a code snippet that computes test_var = rand() and compares it with (1 − k_n). If test_var > (1 − k_n), then the next TF in the sequence can bind to the ‘growing’ complex. Otherwise, enter a code snippet that allows the last-to-bind TF to dissociate (or not).

   k = 0.1: 10% chance of binding during each time step
   rand() output: a uniform random variate on (0, 1), so 10% of draws fall below 0.1 and 10% fall above 0.90; the latter is the test_var > (1 − k) case for k = 0.1

3. ***
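
The state-vector idea above can be sketched as follows. This is a minimal Python sketch, not the MATLAB program the notes have in mind, and it rests on several assumptions: the names FACTORS, simulate_assembly, k_bind, k_off and max_steps are hypothetical; a single binding probability k_bind is used for every factor instead of a per-factor k_n; and the dissociation rule (the last-to-bind factor leaves with probability k_off on a step where no binding occurred) is only one possible reading of "dissociate (or not)".

```python
import random
import statistics

# Binding order taken from the notes: TBP, TFIID, then TF#2 ... TF#6.
FACTORS = ["TBP", "TFIID", "TF#2", "TF#3", "TF#4", "TF#5", "TF#6"]


def simulate_assembly(k_bind=0.1, k_off=0.0, max_steps=100_000, rng=None):
    """Simulate ordered binding; return (steps_taken, final_state).

    steps_taken is None if the complex is still incomplete after max_steps.
    """
    if rng is None:
        rng = random.Random()
    state = [0] * len(FACTORS)  # 0 = unbound, 1 = bound, in binding order
    n_bound = 0                 # factors bound so far (they bind in order)
    for step in range(1, max_steps + 1):
        test_var = rng.random()  # uniform on [0, 1), like MATLAB's rand()
        if test_var > 1.0 - k_bind:
            # The next factor in the sequence joins the growing complex.
            state[n_bound] = 1
            n_bound += 1
            if n_bound == len(FACTORS):
                return step, state  # all bound: transcription can start
        elif n_bound > 0 and rng.random() < k_off:
            # Assumed rule: the last-to-bind factor may dissociate.
            n_bound -= 1
            state[n_bound] = 0
    return None, state


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs agree
    times = [simulate_assembly(rng=rng)[0] for _ in range(2000)]
    print("mean steps to complete complex:", statistics.mean(times))
```

With k_off = 0 each binding wait is geometric with mean 1/k_bind, the discrete-time counterpart of the exponential waiting time in note 3, so the mean time to a complete complex should be near 7/0.1 = 70 steps. Tracking n_bound alongside the vector avoids a seven-way switch on the state, since the factors can only bind in order.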