Modeling and Simulation of Reaction Networks Matlab codes gill1 lotka gill1o lotkao bruss Our Boolean approach to gene regulation invoked the logic rule of each operon without questioning the underlying biophysical mechanism. The key players are the gene (stretch of DNA) and its reader (RNA Polymerase). Governing genes regulate the transcription of their subjects by either promoting or prohibiting the binding of RNAP to the subject's stretch of DNA. This suggests that we might benefit from a tool that simulates the interactions that occur in a network of reacting chemical species. In the next two weeks we will consider two standard means for modeling and simulating such systems. The first is associated with the name of Gillespie and the second with the names of Michaelis and Menten. Stochastic Chemical Kinetics This is a beautifully simple procedure for capturing the dynamics (kinetics) of a randomly interacting (stochastic) bag of chemicals (us?). We begin with N reacting chemical species, S1, S2, ..., SN, and their initial quantities X1, X2, ..., XN We suppose that these species interact via M distinct reactions R1, R2, ..., RM and that these reactions occur with individual propensities. c1, c2, ..., cM For example, if R1 is X1+X2&rarr X3 then we suppose that the average probability that a particular S1S2 pair react within time dt is c1dt. If, at time t, there are X1 molecules of S1 and X2 molecules of S2 then there are X1X2 possible pairs and hence the probabilty of reaction R1 occuring in the window (t,t+dt) is X1X2c1dt. For more general reactions we will write hm for the number of distinct Rm molecular reactant combinations available in the state (X1,...,Xn). We then arrive at the reaction probability am=hmcm. We shall have occasion to call on a0 = a1+a2+...+aM Given this list of reactions and their propensities one naturally asks Which reaction is likely to happen next? and When is it likely to occurr? Gillespie answered both questions at once by calculating P(T,m)dT = probability that, given the state (X1,...,Xn) at time t, the next reaction will be reaction m and it will occur in the interval (t+T,t+T+dT). A careful reading permits us to write P(T,m)dT as the product of two more elementary terms, namely, the probability that no reaction occurs in the window (t,t+T), and the probability that reaction m occurs within time dT of T. We have already seen the latter, and so, denoting the former by P0(T), we find that P(T,m)dT = P0(T)amdT Regarding P0(T), as the probabilty that no reaction will occur in (t,t+dT) is 1-a0dT it follows that P0(T+dT) = P0(T)(1-a0dT) or (P0(T+dT)-P0(T))/dT = -a0P0(T) and which, in the limit of small dT, states P0'(T) = -a0P0(T) and so P0(T) = exp(-a0T) It now comes down to drawing or "generating" a pair (T,m) from the set of random pairs whose probablity distribution is P(T,m) = amexp(-a0T) Gillespie argues that this is equivalent to drawing two random numbers, r1 and r2, from the [0,1]uniform distribution and then solving (G1) r1 = exp(-a0T) for T and solving (G2) a1+a2+...+am-1 < r2a0 < a1+a2+...+am for m. We have now assembled all of the ingredients of Gillespie's original algorithm: Gather propensities, reactions, initial molecular counts, and maxiter from user/driver. Set t = 0 and iter = 0; While iter < maxiter Plot each Xj at time t Calculate each aj Generate r1 and r2 Solve (G1) and (G2) for T and m Set t = t + T Adjust each Xj according to Rm Set iter = iter + 1 End The gathering of reactions, like the gathering of wire and rule, may take some thought. For small networks we can meet the problem head on. We reproduce two examples from Gillespie. The first is his equation (29) X&infin + Y &rarr 2Y with propens ity c1 (G29) 2Y &rarr Z with propens ity c2 Here the subscript on X&infin indicates that its level remains constant (either because it corresponds to a constant feed or it is in such abundance that the first reaction will hardly diminish it) throughout the simulation. Here is the code that produced the figure at right. Though poorly documented you can nonetheless see each step of the algorithm. Our second example is Gillespie's equation (33) (G33) X&infin + Y1 &rarr 2Y1 with propensity c1 Y1 + Y2 &rarr 2Y2 with propensity c2 Y2 &rarr Z with propensity c3 Roughly speaking, the Y1 species procreates, is consumed by the Y2 species, which itself dies off at a fixed rate. Such models are known as predator-prey and, one can imagine, that the two populations often keep one another in check. For example, here is the code that generated the lovely figure below. Of course for larger reaction networks we, I mean you, must proceed more systematically. On the road to a more general robust method we wish to address 1. The user is unsure of how to choose maxiter and would prefer to provide the final time, tfin. 2. The user may enter the reaction list as a cell and code the update of h in a subfunction and so preclude a very lengthy if clause. 3. A single run of Gillespie is rarely sufficient - for it delivers but one possible time evolution. The user would prefer that we made nr runs and collected and presented statistics across runs. We may incorporate the first point by exchanging Gillespie's for for a while. To address the second point, I imagine a cell where each entry encodes a full reaction, say, e.g., by adopting the following convention rtab{k}(1:2:end) = indicies of reaction species rtab{k}(2:2:end) = actions for species above for the kth reaction. In the Lotka example, this would read rtab = {[1 1] [1 -1 2 1] [2 -1]} % Y1 -> Y1 + 1 % Y1 -> Y1 - 1 % Y2 -> Y2 - 1 and the update function would look like function h = update(y) h(1) = y(1); and Y2 -> Y2 + 1 h(2) = y(1)*y(2); h(3) = y(2); The use of update is relatively straightforward, e.g., a = c.*h; With this a in hand you may now use cumsum and find to find the next reaction index (in just a couple of lines - and with NO ifs). With this index in hand you may visit the proper element of rtab. To address the third point above, we should suppress plotting on the fly and instead return the time and desired solution vectors to a driver that will collect and plot statistics. In particular, if we lay our Gillespie runs along the rows of a big matrix and take means (averages) down the columns, e.g., for j=1:nr [t,x] = mygill(tfin,rtab,x0,c); X(j,:) = x; end avg = mean(X,1); we "ought" to be on the right track. The trouble is that each of our Gillespie runs are loyal to particular time vectors. One way around this is to legislate a uniform time vector tvec = 0:tinc:tfin and bring the individual x vectors into X only after interpolating them with interp1. Deterministic Chemical Kinetics The stochastic view is a bare-handed means of following the dynamics of individual molecules. For large reacting systems this approach may be prohibitively expensive and/or unnecessarily precise. In such cases one may rely on more coarse grained tools, e.g., the Law of Mass Action: The rate of a chemical reaction is directly proportional to the product of the effective concentrations of each participating reactant. We will come to understand this law by its application. We revisit (G29) in the context of a fixed volume V and reaction rates k1 and k2 (in units of 1/M/s) where M designates Molar (which itself is short for moles per volume) and s designates seconds, and write X&infin + Y -> 2Y (propensity c1) (G29) 2Y -> Z (propensity c2) If we denote the concentration of X by [X] then the Law of Mass Action dictates that (G30) d[Y(t)]/dt = 2c1[X&infin][Y(t)] (c1[X&infin][Y(t)]+2(c2/2)[Y(t)]2) gains losses The connection bewteen the rates and the propensities k1 = c1V and k2 = c2V/2 is worked out in section IIC on page 2343 of Gillespie. Now, if, we denote the initial concentration [Y(0)] = Y0 we may solve the ordinary differential equation (G30) by hand (via calculus or dsolve('DY = k1X*Y-2*k2*Y^2','Y(0)=Y0') in matlab) and find [Y(t)] = Y0k1[X&infin] --------------------------------2Y0k2 - (2Y0k2-k1[X&infin])exp(-k1[X&infin]t) This function is easy to graph and so compare with its stochastic cousin. Assuming V=1 we arrive at the figure below on running gill1o.m We see that the deterministic model does an excellent job of tracking the stochastic trajectory. This is, of course, but one simple example. The Law of Mass Action applied to the Lotka system (G38) X&infin + Y1 -> 2Y1 (propensity c1) Y1 + Y2 -> 2Y2 (propensity c2) Y2 -> Z (propensity c3) yields d[Y1]/dt = c1[X&infin][Y1] - c2[Y1][Y2] (G39) d[Y2]/dt = c2[Y1][Y2] - c3[Y2] Now each of the ks differ from the cs by a power of V. The solution of this system is not presentable and so we turn to approximate, or numerical, means. Matlab sports an entire suite of programs for solving differential equations. The default program is ode23. We invoke it in our Lotka 2-way code and arrive at the figure below. As above blue is the Gillespie trajectory and red is the ode trajectory. As further practice we consider a third example. Namely, the Brusselator of Gillespie, equation (44) X&infin,1 -> Y1 X&infin,2 + Y1 -> Y2 + Z1 (propensity c1) (propensity c2) (G44) 2Y1 + Y2 -> 3Y1 Y1 -> Z2 (propensity c3) (propensity c4) becomes d[Y1]/dt = c1[X&infin,1]- c2[X&infin,2][Y1] + (3-2)(c3/2)[Y1]2[Y2] - c4[Y1] (G45) d[Y2]/dt = c2[X&infin,2][Y1] - (c3/2)[Y1]2[Y2] This associated ode code produces the figures below. You may wish to zoom in to better see the transitions that each species undergoes, or better yet, take a look at the phase-plane. To animate this replace the last "plot" in bruss.m with "comet". % % % % % % % % % % % gill1.m Run Gillespie example, eqn (29) usage: gill1(y,maxi,inc) y = initial # of y molecules maxi = number of iterations to be taken inc = number of iterations between displayed results example: gill1(3000,30000,100) function gill1(y,maxi,inc) c1x = 5; c2 = 0.005; t = 0; plot(t,y,'.') hold on for i=1:maxi, a1 = c1x*y; a2 = c2*y*(y-1)/2; a0 = a1 + a2; r1 = rand; tau = (1/a0)*log(1/r1); t = t + tau; r2 = rand; if r2 > a1/a0 y = y - 2; % apply reaction 2 else y = y + 1; end if mod(i,inc) == 1 plot(t,y,'.') end end % apply reaction 1 hold off xlabel('time','fontsize',16) ylabel('number of y molecules','fontsize',16) title('Gillespie (29): x+y->2y->z, c_{1}x = 5, c_{2} = 0.005','fontsize',16) **************************************** % % % % % % % % % % % lotka.m Run Gillespie example, eqn (38) usage: lotka(y,maxi,inc) y = initial # of [y(1) y(2)] molecules maxi = number of iterations to be taken inc = number of iterations between displayed results example: lotka([2000 2000],100000,100) function lotka(y,maxi,inc) c1x = 10; c2 = 0.01; c3 = 10; t = 0; plot(y(1),y(2),'.') hold on for i=1:maxi, a1 = c1x*y(1); a2 = c2*y(1)*y(2); a3 = c3*y(2); a0 = a1 + a2 + a3; r1 = rand; tau = (1/a0)*log(1/r1); t = t + tau; r2 = rand; if r2 > (a1+a2)/a0 y(2) = y(2) - 1; % apply reaction 3 elseif r2 > a1/a0 y(2) = y(2) + 1; % apply reaction 2 y(1) = y(1) - 1; else y(1) = y(1) + 1; % apply reaction 1 end if mod(i,inc) == 1 plot(y(1),y(2),'.') end end hold off xlabel('number of y_{1} molecules','fontsize',16) ylabel('number of y_{2} molecules','fontsize',16) title('Lotka via Gillespie','fontsize',16) axis equal ****************************************** % % % % % % % % % % % gill1o.m Run Gillespie example, eqn (29), two ways usage: gill1o(y,maxi,inc) y = initial # of y molecules maxi = number of iterations to be taken inc = increment between plotted points example: gill1o(3000,30000,10) function gill1o(y,maxi,inc) c1x = 5; c2 = 0.005; t = 0; y0 = y; plot(t,y,'.') hold on for i=1:maxi, a1 = c1x*y; a2 = c2*y*(y-1)/2; a0 = a1 + a2; t = t + log(1/rand)/a0; if rand > a1/a0 y = y - 2; else y = y + 1; end % apply reaction 2 % apply reaction 1 if mod(i,inc)==0 plot(t,y,'.') end end k1X = c1x; k2 = c2/2; time = linspace(0,t,5000); dety = y0*k1X./(y0*2*k2 - (y0*2*k2-k1X)*exp(-k1X*time)); plot(time,dety,'r') hold off xlabel('time','fontsize',16) ylabel('number of y molecules','fontsize',16) title('Gillespie (29) Stochastic (blue) vs. Deterministic (red)','fontsize',16) ******************************************** % % % % % % % % % % % % lotkao.m Run Gillespie example, eqn (38) both stochastically and deterministicly usage: lotkao(y,maxi,inc) y = initial # of [y(1) y(2)] molecules maxi = number of iterations to be taken inc = number of iterations between displayed results example: lotkao([2000 2000],100000,100) function lotkao(y,maxi,inc) c1x = 10; c2 = 0.01; c3 = 10; y0 = y; t = 0; plot(y(1),y(2),'.') hold on for i=1:maxi, a1 = c1x*y(1); a2 = c2*y(1)*y(2); a3 = c3*y(2); a0 = a1 + a2 + a3; r1 = rand; tau = (1/a0)*log(1/r1); t = t + tau; r2 = rand; if r2 > (a1+a2)/a0 y(2) = y(2) - 1; % apply reaction 3 elseif r2 > a1/a0 y(2) = y(2) + 1; y(1) = y(1) - 1; % apply reaction 2 y(1) = y(1) + 1; % apply reaction 1 else end if mod(i,inc) == 1 plot(y(1),y(2),'.') end end % now solve the lotka ode system and plot the solution [time,y] = ode23(@(t,y) lotkasys(t,y,c1x,c2,c3),[0 t],y0'); plot(y(:,1),y(:,2),'r') xlabel('number of y_{1} molecules','fontsize',16) ylabel('number of y_{2} molecules','fontsize',16) title('Lotka Two Ways','fontsize',16) axis equal hold off % % % the Lotka pair of ordinary differential equations function dy = lotkasys(t,y,k1x,k2,k3) dy(1,1) = k1x*y(1) - k2*y(1)*y(2); dy(2,1) = k2*y(1)*y(2) - k3*y(2); ****************************************** % % bruss.m % % Solve The Brusselator System % % usage: bruss(tfin,y0,c) % % where tfin = final time % y0 = initial state % c = 4 rates % % ex: bruss(10,[1e3; 1e3],[5e3 5e1 5e-5 5]) % function bruss(tfin,y0,c) [t,y] = ode23(@(t,y) sysbruss(t,y,c),[0 tfin],y0); plot(t,y) legend('y_1','y_2') xlabel('t','fontsize',16) ylabel('y','fontsize',16) title('Solution to (G45), The Brusselator','fontsize',16) figure plot(y(:,1),y(:,2)) xlabel('y_1','fontsize',16) ylabel('y_2','fontsize',16) title('The Brusselator Limit Cycle','fontsize',16) function dy = sysbruss(t,y,c) dy(1,1) = c(1) - c(2)*y(1) + (c(3)/2)*y(1)^2*y(2) - c(4)*y(1); dy(2,1) = c(2)*y(1) - (c(3)/2)*y(1)^2*y(2);