University of Cologne
Biological Physics II: Systems

Contents

1 Dynamical systems
  1.1 Nonlinear dynamics
    1.1.1 Examples of dynamical systems
    1.1.2 Fixed points
    1.1.3 Linear stability analysis
    1.1.4 Phase portraits
    1.1.5 Hartman–Grobman theorem
    1.1.6 Closed orbits
    1.1.7 Poincaré–Bendixson theorem
  1.2 Bifurcations
    1.2.1 Saddle-node bifurcation
    1.2.2 Transcritical bifurcation
    1.2.3 Pitchfork bifurcation
    1.2.4 Hopf bifurcation
  1.3 Small gene-regulatory networks
    1.3.1 Gene regulation functions
    1.3.2 Molecular mechanisms of gene regulation
    1.3.3 Cooperativity
    1.3.4 Self-activating gene
    1.3.5 Two mutually repressing genes
    1.3.6 A genetic oscillator, the »repressilator«
  1.4 Quasiperiodicity
    1.4.1 Minimal example: Two incommensurable oscillators
    1.4.2 Practical example: The Moon
    1.4.3 Quasiperiodicity in Phase Space
  1.5 Chaos
    1.5.1 Lyapunov exponents
    1.5.2 Strange attractors
    1.5.3 What systems are chaotic?
    1.5.4 Detecting chaos from data – nonlinear time-series analysis

2 Fluctuations in biological processes
  2.1 Random processes in cells
    2.1.1 Poisson processes
    2.1.2 Birth-death process
    2.1.3 Master equation and Gillespie algorithm
    2.1.4 Processes consisting of many steps
  2.2 Noise in gene expression
    2.2.1 Testing the birth-death process as a stochastic model of transcription
    2.2.2 Transcription as a bursting process
    2.2.3 Gene expression model including translation
    2.2.4 Functional relevance of gene expression noise

3 Biological pattern formation
  3.1 Self-organized patterns in reaction-diffusion systems
    3.1.1 Turing instability
    3.1.2 Turing patterns in one dimension
    3.1.3 Activator-inhibitor systems in two dimensions
  3.2 Morphogen gradients
    3.2.1 French flag model and positional information
    3.2.2 Diffusion-degradation model
    3.2.3 Scaling of pattern with tissue size
    3.2.4 Robustness of morphogen gradients
    3.2.5 Precision of morphogen gradients

4 Empirical laws in biology
  4.1 Scaling laws in biology
    4.1.1 Allometric scaling
    4.1.2 Scaling in the functional content of genomes
  4.2 Bacterial growth laws
    4.2.1 Dependence of cellular DNA content on growth rate
    4.2.2 Dependence of cellular ribosome content on growth rate
    4.2.3 Minimal model of proteome allocation

1 Dynamical systems

The goal of this part is to introduce basic concepts of dynamical-systems theory: how dynamical systems can be analyzed, and how their key properties can be understood without solving the equations exactly. This framework is extremely useful for the theoretical description and analysis of diverse biological systems.

1.1 Nonlinear dynamics

The general framework of dynamical systems can be used to describe (»model«) many phenomena in biology, including gene-regulation networks, neuronal networks, population dynamics, etc. The most common form of a (continuous-time) dynamical system is a set of differential equations of the form

    ẋ⃗ = dx⃗/dt = f⃗(x⃗),

where the function f⃗ can be nonlinear. More generally, f⃗ can depend explicitly on time: f⃗ = f⃗(x⃗, t). If f⃗(x⃗) is sufficiently smooth, solutions to these equations exist and are unique. The solutions x⃗(t) are called trajectories. It is straightforward to obtain these solutions numerically, e.g., by using Euler's forward integration method

    x⃗(t + Δt) = x⃗(t) + f⃗(x⃗(t)) Δt,

starting at an initial value x⃗(t = 0) = x⃗₀.

If f⃗ is a linear function of x⃗, it is relatively easy to calculate an analytical solution. However, many real-world phenomena are nonlinear, and only few such systems can be solved analytically, which makes nonlinear systems harder to study. Dynamical-systems theory provides a framework for analyzing them.

1.1.1 Examples of dynamical systems

Growth of a population of bacteria:

    Ṅ = gN,

with the cell number N ≥ 0 and the growth rate g > 0.
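Euler's forward method applied to this growth equation makes a convenient first numerical experiment. The following is a minimal Python sketch; the step size and parameter values are illustrative choices, not values from the text:

```python
# Euler forward integration of the bacterial growth equation dN/dt = g*N.
# Step size dt and the parameters g, N0 are illustrative choices.
import math

def euler(f, x0, dt, n_steps):
    """Integrate dx/dt = f(x) with Euler's forward method; return the trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + f(xs[-1]) * dt)
    return xs

g, N0 = 0.5, 100.0            # growth rate and initial cell number (illustrative)
dt, n_steps = 0.001, 4000     # small step for accuracy; total time T = 4
traj = euler(lambda N: g * N, N0, dt, n_steps)

exact = N0 * math.exp(g * n_steps * dt)   # known solution of the linear equation
print(traj[-1], exact)                    # Euler result is close to the exact one
```

Shrinking Δt reduces the discretization error, which for this linear equation grows roughly linearly in Δt.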
The solution of this linear equation is N(t) = N₀ exp(gt).

Logistic population growth: An example of a nonlinear dynamical system is the logistic growth equation:

    Ṅ = gN (1 − N/K)

Here, K is the carrying capacity, i.e., the maximum population size that can be sustained (e.g. due to limited nutrient availability). This is a rare example that can be solved analytically; the solution is:

    N(t) = K N₀ exp(gt) / [K + N₀ (exp(gt) − 1)]

1.1.2 Fixed points

The fixed points of a dynamical system are points x⃗ for which f⃗(x⃗) = 0.

Stability of fixed points: A fixed point x⃗* is stable (an attractor) if the system returns to x⃗* upon any small perturbation δx⃗, i.e., trajectories starting at x⃗* + δx⃗ return to x⃗* for t → ∞ for any δx⃗, provided that ‖δx⃗‖ is sufficiently small.

Example: The logistic growth equation Ṅ = gN (1 − N/K) has an unstable fixed point at N = 0 and a stable fixed point at N = K.

1.1.3 Linear stability analysis

Linear stability analysis provides a systematic way to test whether a fixed point x⃗* is stable. Writing x⃗ = x⃗* + δx⃗, we linearize the system around x⃗*:

    δx⃗̇ = ẋ⃗ = f⃗(x⃗) = f⃗(x⃗* + δx⃗) = f⃗(x⃗*) + Df⃗(x⃗*) · δx⃗ + 𝒪(δx⃗²)

Here, Df⃗(x⃗*) is the Jacobian matrix at x⃗*, with entries

    (Df⃗)ᵢⱼ = ∂fᵢ/∂xⱼ,   i, j = 1, …, N.

For one-dimensional systems, the Jacobian is simply the first derivative df/dx evaluated at x = x*. Since f⃗(x⃗*) = 0, we have δx⃗̇ = Df⃗(x⃗*) · δx⃗ + 𝒪(δx⃗²).

The stability of the fixed point x⃗* is determined by the eigenvalues of the Jacobian Df⃗(x⃗*). A fixed point is …

• stable (an attractor) if the real parts of all eigenvalues of Df⃗(x⃗*) are smaller than 0.
• unstable if the real part of at least one eigenvalue of Df⃗(x⃗*) is greater than 0.
• a repellor if the real parts of all eigenvalues of Df⃗(x⃗*) are greater than 0.
• a saddle if the real part of at least one eigenvalue of Df⃗(x⃗*) is greater than 0 and that of at least one other eigenvalue is smaller than 0.
• marginal if the real part of at least one eigenvalue is 0.

The eigenvalues of Df⃗(x⃗*) also determine the time scale of relaxation towards the fixed point: for an attractor x⃗*, the relaxation time is 1/|ℜ(λ₁)|, where λ₁ is the eigenvalue with the largest (i.e., least negative) real part.

To see where these results come from, note that the solution of the linearized system δx⃗̇ = Df⃗(x⃗*) · δx⃗ has the form δx⃗(t) = exp(Df⃗(x⃗*) t) δx⃗₀. We can change coordinates so that Df⃗(x⃗*) becomes diagonal and the different dimensions uncouple. The result in each dimension is proportional to exp(λᵢt), where λᵢ is an eigenvalue of the Jacobian.

In one dimension, the dynamics of the linearized system is given by δẋ = λδx with λ = df/dx|ₓ₌ₓ*, and the solution is simply δx(t) = δx₀ exp(λt) = δx₀ exp(ℜ(λ)t), since λ is real here (ℜ(z) denotes the real part of the complex number z).

The basin of attraction of a (stable) fixed point x⃗* is the set of initial points x⃗₀ whose trajectories lead to x⃗* for t → ∞.

Example: Lotka–Volterra model of species competition (see [33], p. 155).

1.1.4 Phase portraits

Using a phase portrait, we can understand key qualitative properties of the dynamics of a system without having a complete solution. For each initial point x⃗₀ in phase space, the dynamical system defines a trajectory x⃗(t). A phase portrait consists of these trajectories for many initial points. In practice, we can compute many trajectories numerically and simply plot them.

Even without resorting to numerical solutions, key features of a phase portrait can be captured. In particular, we can plot:

• The fixed points (stable and unstable).
• The trajectories near the fixed points (obtained from linear stability analysis).
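As an illustration of the recipe above, the following Python sketch classifies the coexistence fixed point of a Lotka–Volterra competition model. The vector field and its parameter values are an illustrative choice, not necessarily the specific model from [33]:

```python
# Linear stability analysis of a Lotka-Volterra competition model:
#   dx/dt = x (3 - x - 2y),   dy/dt = y (2 - x - y)
# Parameter values are illustrative; (x*, y*) = (1, 1) is a coexistence fixed point.

def jacobian(x, y):
    """Jacobian of the vector field, from the partial derivatives."""
    return [[3 - 2 * x - 2 * y, -2 * x],
            [-y, 2 - x - 2 * y]]

def eigenvalues_2x2(J):
    """Eigenvalues of a 2x2 matrix via trace and determinant (real case)."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = tr * tr - 4 * det
    assert disc >= 0, "complex eigenvalues; handle separately"
    s = disc ** 0.5
    return (tr - s) / 2, (tr + s) / 2

lam1, lam2 = eigenvalues_2x2(jacobian(1.0, 1.0))
print(lam1, lam2)   # one negative, one positive eigenvalue -> a saddle
```

Here the coexistence point is a saddle, so almost all trajectories end up with one species excluding the other.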
An educated guess for the complete phase portrait can then often be made, using the fact that different trajectories never intersect (because the solutions are unique).

Example: Phase portrait for the Lotka–Volterra model above.

1.1.5 Hartman–Grobman theorem

While a linear approximation should provide useful insights, it is not obvious that linear stability analysis accurately captures the behavior of the full nonlinear system near a fixed point. The Hartman–Grobman theorem states that near a fixed point at which the real parts of all eigenvalues are different from zero, the linearized system correctly describes the qualitative behavior of the full system: there is a homeomorphism that maps the linearized dynamics to the full dynamics and vice versa.

1.1.6 Closed orbits

Closed orbits are periodic solutions which return to the same point x⃗ in phase space after a finite time T, i.e., x⃗(t + T) = x⃗(t).

Example: Harmonic oscillator, mẍ + kx = 0. We convert this second-order differential equation into a dynamical system by introducing ẋ = v. A solution of this system is

    x(t) ∝ sin(ωt),   v(t) ∝ cos(ωt),   with ω = √(k/m).

Hence, the trajectories are closed orbits in phase space. With a damping term, the system becomes mẍ + μẋ + kx = 0, and the trajectories become stable spirals that converge on the stable fixed point (x, v) = (0, 0).

1.1.7 Poincaré–Bendixson theorem

Poincaré–Bendixson theorem: If a trajectory is confined to a closed, bounded region of a two-dimensional system and there are no fixed points inside this region, then there is a closed orbit inside this region.

The Poincaré–Bendixson theorem implies that there is no chaos in two dimensions.

1.2 Bifurcations

Usually, the flow changes only slightly (e.g. the position of a fixed point shifts a bit) when a system parameter is slightly changed. However, sometimes the qualitative structure of the flow changes abruptly as a parameter of the system is varied.
Such abrupt changes are called bifurcations.

1.2.1 Saddle-node bifurcation

In a saddle-node bifurcation, a stable and an unstable fixed point annihilate as a parameter is varied.

Example: ẋ = r + x²

For r < 0, this system has a stable fixed point at x = −√|r| and an unstable fixed point at x = +√|r|. For r > 0, there are no fixed points. Thus, the system undergoes a saddle-node bifurcation at r = 0.

Importantly, such bifurcations are generic, i.e., there are only relatively few different types of bifurcations. For example, for any dynamical system that undergoes a saddle-node bifurcation, the dynamics »looks« like ẋ = r + x² near this bifurcation upon suitable rescaling of the system variables. This form is called the normal form of the bifurcation. This result is closely related to the fact that smooth paths, surfaces, and higher-dimensional manifolds can only intersect each other in a limited number of ways. Here, we do not go into the depth that would be required to make this statement precise; see [12] for details.

Understanding which bifurcations occur is a powerful way of analyzing complicated dynamical systems. In particular, close to a bifurcation point, general results constrain the dynamics of the system.

1.2.2 Transcritical bifurcation

In a transcritical bifurcation, two fixed points exchange their stability as a system parameter is varied.

Example (normal form): ẋ = rx − x² = (r − x)x

There is a fixed point at x* = 0 for all values of r. For r < 0, this fixed point is stable and there is an additional unstable fixed point at x* = r. For r > 0, the fixed point at zero is unstable and there is a stable fixed point at x* = r. Thus, an exchange of stability between the two fixed points occurs at the critical point r = 0.

1.2.3 Pitchfork bifurcation

Certain bifurcations occur only in systems that have symmetries.
In a pitchfork bifurcation, a fixed point splits into a symmetric pair of fixed points, while the fixed point at the symmetry point persists with opposite stability.

Example: Normal form of the (supercritical) pitchfork bifurcation:

    ẋ = rx − x³

Note that this equation is invariant under parity x → −x (the symmetry in this case). For r < 0, there is a stable fixed point at x* = 0. For r > 0, there are two stable fixed points at x* = ±√r, and the fixed point at x* = 0 has become unstable. At the critical point r = 0, only the fixed point at x* = 0 exists; it has become marginal. Solutions no longer approach this fixed point exponentially fast but much more slowly, following an algebraic decay. This slow dynamics at the critical point of the bifurcation is called critical slowing down.

Subcritical pitchfork bifurcation: ẋ = rx + x³

1.2.4 Hopf bifurcation

In a Hopf bifurcation, a fixed point loses stability as a limit cycle is created. A limit cycle is an isolated closed orbit. Here, isolated means that trajectories in close proximity to the limit cycle are not closed orbits, but spiral towards or away from the limit cycle. Like a fixed point, a limit cycle can be stable or unstable. It is stable if all trajectories starting near the limit cycle converge on it for long times.

A Hopf bifurcation thus provides a way to change from a non-oscillating to a stably oscillating state. This is relevant for understanding the dynamics of various oscillators in living systems (e.g. the circadian clock, the cell cycle, hair bundles in the ear, etc.).

Example:

    ṙ = μr − r³
    θ̇ = ω + br²,

where we used polar coordinates x = r cos(θ), y = r sin(θ). This system has a fixed point at r = 0. Linear stability analysis shows that this fixed point is stable for μ < 0 and unstable for μ > 0.
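The radial dynamics can be verified directly; a minimal Python sketch with illustrative parameter values and step size:

```python
# Radial part of the Hopf normal form, dr/dt = mu*r - r**3, integrated with
# Euler's forward method. Parameter values and step size are illustrative.

def integrate_r(mu, r0, dt=1e-3, n_steps=20_000):
    r = r0
    for _ in range(n_steps):
        r += (mu * r - r ** 3) * dt
    return r

r_sub = integrate_r(mu=-0.5, r0=0.1)   # below the bifurcation: spiral into r = 0
r_sup = integrate_r(mu=+0.5, r0=0.1)   # above: converge to the limit cycle r = sqrt(mu)

print(r_sub, r_sup)   # r_sub ~ 0, r_sup ~ sqrt(0.5) ~ 0.707
```

Since θ̇ never vanishes, the radius tells the whole story: r → 0 is the stable spiral, while r → √μ is the stable limit cycle that appears for μ > 0.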
The two eigenvalues λ₁,₂ of the Jacobian matrix have a nonzero imaginary part and cross the imaginary axis from left to right at μ = 0. This behavior is the hallmark of a Hopf bifurcation. The trajectories near the fixed point are stable spirals that converge on r = 0 for μ < 0; they change into spirals that converge on a limit cycle for μ > 0.

Near the bifurcation point, the frequency of the limit-cycle oscillation is approximately ω = |ℑ(λ)|, where ℑ(λ) denotes the imaginary part of the (complex) eigenvalue λ. This result holds generally for Hopf bifurcations.

Supercritical and subcritical Hopf bifurcation

• In a supercritical Hopf bifurcation, the limit cycle stays near the fixed point that lost stability as long as the control parameter is near the critical value μ = 0. The example discussed above is a supercritical Hopf bifurcation.
• In a subcritical Hopf bifurcation, the trajectory jumps to a distant limit cycle as soon as the control parameter crosses the critical value μ = 0. Example:

    ṙ = μr + r³ − r⁵
    θ̇ = ω + br²

The subcritical Hopf bifurcation shows hysteresis: large-amplitude oscillations do not disappear when the control parameter is reduced back below the critical value μ = 0. The stable limit cycle only disappears at a lower value of μ, in what is called a saddle-node bifurcation of cycles, where it is annihilated in a collision with an unstable limit cycle.

Example: Hair-bundle oscillations

In the bullfrog ear, sensory cells called hair cells detect sound waves. Each of these cells has a so-called hair bundle, which is an active oscillator that can amplify low-amplitude sound waves. It has been hypothesized that hair-bundle oscillators may be tuned to the critical point of a Hopf bifurcation.
This scenario is plausible because such a sensory system should not oscillate spontaneously in the absence of sound, i.e., if it exhibits a Hopf bifurcation, it should not be tuned above the critical value. On the other hand, being tuned close to the point where spontaneous oscillations set in ensures strong amplification of low-amplitude sounds. Hence, the idea is that at the critical point the hair bundles can detect weak sound waves with high sensitivity while not yet oscillating spontaneously in the absence of sound (which would not be useful for detecting sound).

This hypothesis leads to several predictions that are testable in (sophisticated) experiments:

• Hair bundles should mostly rest but occasionally start to oscillate spontaneously (due to fluctuations).
• It can be shown that, at the critical point of a Hopf bifurcation, the gain g (i.e., the oscillation amplitude divided by the stimulus force f) generally behaves as g ∼ f^(−2/3). This diverges as f → 0, which implies that weak stimuli are drastically amplified; strong stimuli, in contrast, are attenuated.

Notably, experimental data from bullfrog hair cells are consistent with these predictions. Such a mechanism could explain how hair cells can detect sounds that carry less energy than the thermal noise background. For details, see [10].

1.3 Small gene-regulatory networks

The general framework discussed so far can be applied to small gene-regulatory networks, i.e., networks of genes coding for transcription factors (TFs) that control the expression of the genes in the network. In particular, it is useful for understanding the dynamics of networks that consist of a few genes with well-understood regulation. Some of the best-characterized examples of such networks come from synthetic biology, i.e., they are artificially constructed from well-characterized molecular components from bacteria.
In general, even if the values of the system parameters are not known, dynamical-systems theory can be used to find out whether a given network is capable of certain dynamic behaviors.

1.3.1 Gene regulation functions

In the simplest cases of gene regulation, a transcription factor binds in the promoter region of a gene. A repressor prevents the transcription of the gene by RNA polymerase (RNAP), e.g., by simply blocking the RNAP binding site. An activator facilitates transcription, e.g., by lowering the binding energy of RNAP to the promoter.

The gene regulation function describes the rate s_y at which a protein Y (or its mRNA) is produced as a function of the concentration of a transcription factor X that controls the expression of gene y. Empirical observations show that gene regulation functions are often sigmoidal and generally well approximated by Hill functions. For an activator, these have the shape

    s_y(x) = xⁿ / (1 + xⁿ),

and for a repressor

    s_y(x) = 1 / (1 + xⁿ).

Here, x is the concentration (or molecule number) of protein X inside a living cell and n is the Hill exponent (often called Hill coefficient). The case where only a single activator can bind at the promoter region is captured by n = 1. The case n > 1 describes cooperative gene regulation, i.e., a situation where multiple TFs can bind to the promoter region simultaneously and the binding of an additional TF becomes more likely if a TF is already bound at the promoter. For simplicity, we use a normalized, dimensionless form of these gene regulation functions here, so that the supremum of s_y(x) is 1 and s_y(1) = 1/2.

1.3.2 Molecular mechanisms of gene regulation

To demystify the form of the gene regulation functions introduced above, we will go through their detailed derivation for a few simple examples. Consider a regulator R (e.g. an activator) that can bind to an operator site O on the DNA.
This situation can be described as a chemical reaction:

    R + O ⇌ RO   (forward rate constant k⁺, reverse rate constant k⁻)

As is common in chemistry, we denote the concentration of X by [X]. Binding occurs at rate k⁺[R][O], unbinding at rate k⁻[RO]. The chemical rate equation is:

    d[R]/dt = k⁻[RO] − k⁺[R][O].

In equilibrium, d[R]/dt = 0, which implies

    [R][O] / [RO] = k⁻/k⁺ =: K   (law of mass action).

Here, K is the dissociation constant. Note that K has units of a concentration.

Our next goal is to express the fraction of time that the operator is occupied, φₒ, as a function of the total concentrations of R and O:

    [O]_T = [O] + [RO]   (free + bound),
    [R]_T = [R] + [RO]   (free + bound),

    φₒ = [RO]/[O]_T = ([R][O]/K) / ([O] + [R][O]/K) = [R] / (K + [R]) ≈ [R]_T / (K + [R]_T),

where the last approximation holds because typically [R] ≫ [RO], [O]. Assuming that transcription only occurs when the activator R is bound to the operator O, this result explains the shape of the gene regulation function introduced above for n = 1.

1.3.3 Cooperativity

To understand how gene regulation functions with n > 1 can come about, consider a situation where the regulator R can bind to the operator only as a dimer R₂:

    R + R ⇌ R₂   (forward rate constant k_D⁺, reverse rate constant k_D⁻)

Dimerization equilibrium:

    [R]² / [R₂] = k_D⁻/k_D⁺ =: K_D

Fraction of time that the operator is occupied:

    φₒ = [R₂O]/[O]_T = [R₂] / (K + [R₂]) = [R]² / (K K_D + [R]²),

where we used the first-order result from above in the second step. In the limit [R]_T ≪ K_D, most regulators exist as monomers, i.e., [R]_T ≈ [R]:

    φₒ ≈ [R]_T² / (K K_D + [R]_T²).

With K̃ = √(K K_D), this is the gene regulation function used above with n = 2.

The Hill function with Hill coefficient n summarizes and generalizes these expressions:

    φₒ(x) = xⁿ / (K̃ⁿ + xⁿ).

Even in cases where the molecular mechanisms are different or unknown, this functional form can often capture empirical gene regulation functions to a good approximation, sometimes with non-integer values of n.

For bacterial gene regulation, the gene regulation function can also be calculated using thermodynamic models of gene regulation. This theoretical framework assumes that the binding of TFs and RNAP to the promoter is in thermodynamic equilibrium and that the binding probability of RNAP determines the transcription rate from the promoter (which then determines the protein production rate). For details on this approach, see [6, 28].

The relatively simple regulation by a single transcription factor is common in bacteria, but keep in mind that genes are often regulated by multiple transcription factors, which requires more complicated gene regulation functions than the ones described in this section. Gene regulation in eukaryotes is much more complicated and typically involves dozens of transcription factors for each gene.

1.3.4 Self-activating gene

Consider a gene coding for a transcription factor X that activates its own transcription. Here, an increase in the concentration of protein X leads to a further amplification of X production. This is an example of a positive feedback loop. Can this system exhibit bistable expression (with two states in which gene X is essentially off or on)? If so, under which conditions?

We describe this system with the following equation:

    ẋ = s(x) − κx,

where x is the concentration of X proteins in the cell, κ is the degradation rate of X proteins, and s(x) is a synthesis term that reflects the self-activation:

    s(x) = xⁿ / (1 + xⁿ).

Case n = 1: The system has one stable fixed point. Depending on the degradation rate κ, this fixed point is located at x = 0 or at some x > 0. Thus, the system is monostable.
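This can be checked numerically. For n = 1, a nonzero fixed point solves x/(1 + x) = κx, i.e., x* = 1/κ − 1, which exists only for κ < 1. A Python sketch with illustrative κ values:

```python
# Fixed points of the self-activating gene with n = 1:  dx/dt = x/(1+x) - kappa*x.
# x* = 0 is always a fixed point; additional fixed points show up as sign
# changes of dx/dt on a grid. The kappa values and grid are illustrative.

def rhs(x, kappa):
    return x / (1.0 + x) - kappa * x

def nonzero_fixed_points(kappa, x_max=10.0, n=10_000):
    dx = x_max / n
    vals = [rhs(i * dx, kappa) for i in range(1, n + 1)]
    # report the midpoint of every grid interval where dx/dt changes sign
    return [(i + 1.5) * dx for i in range(len(vals) - 1)
            if vals[i] * vals[i + 1] < 0.0]

print(nonzero_fixed_points(0.4))   # one nonzero fixed point near 1/0.4 - 1 = 1.5
print(nonzero_fixed_points(1.2))   # none: only x* = 0 remains
```

For each κ there is thus at most one nonzero fixed point, consistent with monostability.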
This system undergoes a transcritical bifurcation: at a critical value of κ, the stable and the unstable fixed point collide and exchange stability (see section 1.2.2).

Case n > 1: For small values of κ, the system has two stable fixed points (one at x = 0, one at x > 0) and one unstable fixed point in between. As κ is increased, a saddle-node bifurcation occurs, in which the unstable fixed point and the stable fixed point at x > 0 annihilate. Overall, there is a parameter regime in which the system is bistable, and it can thus act as a genetic switch.

Bifurcation diagram: (figure not reproduced here)

Conclusion: A self-activating gene can show bistability, but this depends on (1) a strong nonlinearity in the system (n > 1, usually achieved by cooperative gene regulation) and (2) the values of the system parameters (degradation rate κ not too large).

1.3.5 Two mutually repressing genes

Goal: Build a bistable genetic circuit that can be switched between two states.

The first published example of a synthetically built bistable genetic switch is from Jim Collins' lab at Boston University and used two mutually repressing genes [16]. The repressors used can be switched into an inactive form by adding an inducer. For example, the lac promoter is negatively regulated by the lac repressor. When the lac repressor binds an allolactose molecule, it undergoes a conformational change; upon this allosteric change, its binding affinity to the promoter is greatly reduced, i.e., it becomes essentially inactive. It is common to use an artificial inducer (e.g. IPTG) that has the same effect as allolactose on the lac repressor but has the advantage that it cannot be metabolized by the cell. Here, the idea is that the genetic switch can be flipped using such an inducer.

The molecular implementation of this circuit described in [16] uses the lac promoter and another promoter with similar properties.
Each promoter drives the expression of the repressor of the other promoter. We describe the two cross-repressing genes as a dynamical system:

    ẋ = α₁ / (1 + y^β) − x
    ẏ = α₂ / (1 + x^γ) − y.

Here, α₁,₂ are the maximal synthesis rates, which occur in the absence of repression. The degradation terms describe dilution due to growth alone. The equations are non-dimensionalized so that the growth rate (effective degradation rate) is 1 and the TF concentrations at which repression is half-maximal are 1.

To find the fixed points, we draw the nullclines, i.e., the lines where ẋ = 0 or ẏ = 0. The fixed points are located where the nullclines intersect. How many fixed points there are depends on the promoter strengths α₁,₂. Two stable and one unstable fixed point exist only if the promoter strengths are similar. Then, the system is bistable, with two stable states in which x and y are high, respectively. The system undergoes a saddle-node bifurcation and becomes monostable as the promoter strength α₂ is decreased.

Similarly, we find that at least one of the Hill exponents β, γ needs to be greater than one for the system to be bistable. Hence, as for the self-activating gene, strongly nonlinear regulation is needed to create bistability.

The genetic switch that was constructed using two cross-repressing genes indeed showed clearly bimodal gene expression, could be switched from one state to the other within a few hours, and remained stable in either state for at least 22 h [16].

1.3.6 A genetic oscillator, the »repressilator«

Here, the goal was to build a genetic network that oscillates. Three genes that cyclically repress each other should be able to create oscillations. The simple intuitive picture is that the system tries to find a steady state in which the expression of each of the genes x, y, and z is either high or low. However, no such solution exists, leading to an oscillation.
If we indicate high expression levels of a gene with an upper-case letter and low expression levels with a lower-case letter, the system goes through a cycle:

    (X, y, z) → (X, y, Z) → (x, y, Z) → (x, Y, Z) → (x, Y, z) → (X, Y, z) → (X, y, z)

A detailed theoretical description of this system consists of six coupled differential equations describing both the mRNA and protein levels for each of the three genes. Numerical simulations show that this system exhibits limit-cycle oscillations in a certain parameter regime [15]. For simplicity, we discuss a simplified version of this system that only describes the protein concentrations and uses fixed values for most parameters:

    ẋ = 1/(1 + zⁿ) − x/2
    ẏ = 1/(1 + xⁿ) − y/2
    ż = 1/(1 + yⁿ) − z/2

This system has a single fixed point at (1, 1, 1), which can lose its stability and lead to limit-cycle oscillations. Linear stability analysis shows that, as n is increased above a critical value n_c = 4, a complex-conjugate pair of eigenvalues crosses the imaginary axis from left to right. Thus, this system exhibits the key characteristics of a Hopf bifurcation. Again, strongly nonlinear gene regulation is necessary for this system to show limit-cycle oscillations. The combination of negative feedback with time delay used in the repressilator is a general way of creating oscillations.

The first synthetic-biology implementation of this system [15] used the lac repressor (LacI from E. coli), which inhibits the transcription of the second repressor (TetR from the tetracycline-resistance transposon Tn10), which in turn inhibits the expression of a third repressor (CI from lambda phage). Finally, CI inhibits lacI.

1.4 Quasiperiodicity

Quasiperiodic dynamics are not particularly relevant in biological applications, but we briefly discuss them so that we can properly distinguish them from other phenomena like chaos, and to complete the “zoo” of possible dynamics.
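Before moving on, the Hopf threshold n_c = 4 of the simplified repressilator above can be checked by direct integration; a Python sketch with illustrative choices for n, step size, and initial condition:

```python
# Euler integration of the simplified repressilator:
#   dx/dt = 1/(1+z**n) - x/2, and cyclically for y and z.
# n = 2 (below n_c = 4) should relax to the fixed point (1, 1, 1);
# n = 6 (above n_c) should keep oscillating. All numbers are illustrative.

def simulate(n, steps=40_000, dt=0.01):
    x, y, z = 1.2, 0.9, 1.0           # small asymmetric perturbation of (1, 1, 1)
    late = []                          # record x over the last quarter of the run
    for i in range(steps):
        dx = 1.0 / (1.0 + z ** n) - x / 2.0
        dy = 1.0 / (1.0 + x ** n) - y / 2.0
        dz = 1.0 / (1.0 + y ** n) - z / 2.0
        x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
        if i >= 3 * steps // 4:
            late.append(x)
    return max(late) - min(late)       # late-time oscillation amplitude of x

amp_stable = simulate(n=2)    # essentially 0: the fixed point is stable
amp_osc = simulate(n=6)       # clearly nonzero: limit-cycle oscillations
print(amp_stable, amp_osc)
```

The late-time amplitude is essentially zero below the threshold and of order one above it, as expected for a supercritical Hopf bifurcation.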
Two numbers a and b are called incommensurable iff a/b ∉ ℚ. A quasiperiodic dynamics is the superposition of two (or more) periodic dynamics with incommensurable period lengths.

1.4.1 Minimal example: Two incommensurable oscillators

A straightforward example are two uncoupled harmonic oscillators with incommensurable period lengths, e.g., as described by the following differential equations:

    ẍ = −x,   ÿ = −π²y.

Here, the decomposition into two separate dynamics is obvious, but usually it is not. For example, if we observe only x + y, we get a dynamics like this:

[Figure: time series of sin(t) + sin(πt), shown for t up to 20 and for t up to 300; the signal never exactly repeats.]

1.4.2 Practical example: The Moon

The dynamics of the Moon features three processes with different period lengths, namely the synodic period (≈ 29.53 d), the nodal precession (≈ 6793 d), and the apsidal precession (≈ 3233 d). There is no apparent relation between these period lengths, and we have no reason to assume one, because we know of no strong enough process that would synchronize them, mainly thanks to the almost complete lack of friction in space. We cannot strictly apply the concept of commensurability to these periods, since they are empirical and vary slightly over time. However, it is reasonable to treat them as incommensurable for modeling purposes: the respective dynamics of the Moon can be reasonably described by a quasiperiodic model.

This example also illustrates that quasiperiodic dynamics can be predicted on the long term (as opposed to chaotic dynamics); in principle, we only need to know the phases of all relevant oscillations. Here, a quasiperiodic model allows us to predict solar eclipses far into the future.

A converse example are the Moon's orbital period (around the Earth) and its rotation period (around its own axis).
These are synchronised by a process called tidal locking, involving dissipative deformations of the Moon. As a result, both periods are identical (and thus commensurable) and we can only ever see one side of the Moon from Earth. In biological systems, almost everything is coupled (with a relevant coupling strength), and thus quasiperiodicity is rarely an adequate model.

1.4.3 Quasiperiodicity in Phase Space

Let us consider the example of the two uncoupled oscillators again. The compound phase space of both oscillators is the square [0, 2π) × [0, 2π) with the left side being equivalent to the right one and the bottom side being equivalent to the top one. If we “glue” a square (or rectangle) together at these equivalent sides, we obtain the surface of a torus:

[Figure: the square [0, 2π) × [0, 2π) of phase angles (φ₁, φ₂), with arrows depicting the trajectory and the equivalent edges, glued into a torus.]

Similarly, the phase-space structure of any quasiperiodic dynamics is topologically equivalent to a torus – or a hypertorus if there are more than two incommensurable frequencies.

1.5 Chaos

Besides convergence to an attractor and (quasi)periodic dynamics, there is a third type of bounded deterministic dynamics, namely chaos. Some well-known examples of chaotic systems are the double pendulum (a pendulum attached to a pendulum) and the Lorenz oscillator, a simplified atmospheric model, described by the following differential equations:

ẋ = σ(y − x)
ẏ = rx − y − xz
ż = xy − bz.

Its temporal evolution looks like this (for r = 28, σ = 10, b = 8/3):

[Figure: time series of x, y and z of the Lorenz oscillator for 0 ≤ t ≤ 60.]

We can also use the Lorenz oscillator to illustrate the key feature of chaos, namely sensitivity to initial conditions. The following plot shows the first dynamical variable of two Lorenz oscillators as well as the total distance of their states. At t = 30, one oscillator is perturbed with an amplitude of 10⁻¹⁴.
[Figure: x₁ of both oscillators and the distance |x⃗₁ − x⃗₂| (logarithmic scale) for 0 ≤ t ≤ 90.]

We see that the states of the two oscillators diverge exponentially until (at t ≈ 70) their difference has reached the size of the attractor itself and the two dynamics show visible differences in the plot. This behavior reflects a type of dynamics that is qualitatively different from the regular dynamics discussed so far. The exponential separation of trajectories that start arbitrarily close to each other implies that the system is virtually impossible to predict on larger time scales: Improving the accuracy at which the initial conditions are measured by a constant factor only increases the time for which the system is predictable by an additive constant. This behavior is the fundamental reason why chaotic systems cannot be predicted for long times; e.g., we will never be able to forecast the weather for more than a few days in advance.

1.5.1 Lyapunov exponents

Lyapunov exponents quantify the sensitivity to initial conditions (or the absence thereof). The first Lyapunov exponent quantifies the average rate of the exponential divergence (or convergence) of two infinitesimally close trajectories x⃗ and y⃗ of the dynamics. Formally, we can define it as:

λ₁ := lim_{τ→∞} lim_{|x⃗(t)−y⃗(t)|→0} (1/τ) ln( |x⃗(t+τ) − y⃗(t+τ)| / |x⃗(t) − y⃗(t)| ).

Here, the limit τ → ∞ ensures that we average over the entire dynamics and do not just consider local effects. The first Lyapunov exponent is also called the largest Lyapunov exponent or just the Lyapunov exponent.

[Figure: two nearby trajectories x⃗ and y⃗ at times t and t + τ.]

When evolving a system, an arbitrary infinitesimally small distance between two trajectories will align itself along the direction of strongest expansion (or weakest contraction, if no expansion happens). This direction depends on the current state and is called the (largest) Lyapunov vector.
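The “constant factor versus additive constant” argument can be made concrete with the largest Lyapunov exponent: an initial error δ grows like δ·e^(λ₁τ), so the prediction horizon until the error reaches the attractor size Δ is t* ≈ (1/λ₁)·ln(Δ/δ). A minimal numerical sketch; λ₁ ≈ 0.9 and Δ ≈ 10 are assumed, illustrative values for the Lorenz system:

```python
import math

# Predictability horizon t* ≈ (1/λ₁) · ln(Δ/δ): the time until an initial
# error δ grows to the attractor size Δ. Both values are illustrative.
lam1 = 0.9     # assumed largest Lyapunov exponent (Lorenz-like)
Delta = 10.0   # assumed attractor diameter

def horizon(delta):
    """Time until an initial error delta reaches the attractor size Delta."""
    return math.log(Delta / delta) / lam1

t1 = horizon(1e-14)    # horizon for the 1e-14 perturbation from the plot
t2 = horizon(1e-17)    # measuring the initial state 1000 times more accurately
gain = t2 - t1         # extra prediction time: only ln(1000)/λ₁ ≈ 7.7 units
```

With these numbers, the 10⁻¹⁴ perturbation needs roughly 40 time units to reach attractor size, which is consistent with the divergence in the plot above becoming visible around t ≈ 70 after the perturbation at t = 30.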
The second Lyapunov exponent λ₂ describes the divergence or convergence of perturbations (infinitesimally close trajectories) orthogonal to the first Lyapunov vector. The second Lyapunov vector is defined accordingly. The third Lyapunov exponent λ₃ describes the divergence or convergence of perturbations orthogonal to the first two Lyapunov vectors. And so on. An ODE system has as many Lyapunov exponents as it has dimensions. The sum of the Lyapunov exponents is the divergence of the flow: ∑ᵢ₌₁ᵈ λᵢ = ∇ · f⃗.

To determine the largest Lyapunov exponent of a dynamics given via x⃗̇ = f⃗(x⃗) numerically, we can simulate two close initial conditions and fit an exponential curve to the distance of the resulting trajectories over time. This method is usually called orbit separation. A more sophisticated approach is Benettin’s method [5]. It employs a tangent vector v⃗ representing an infinitesimal deviation from a given trajectory, i.e., an infinitesimally small Lyapunov vector. Its temporal evolution is described by: v⃗̇ = ∇f⃗(x⃗) · v⃗. Practically, we initialise v⃗ randomly and evolve it simultaneously with the actual dynamics (x⃗̇ = f⃗(x⃗)). First, we discard transients, i.e., we evolve the system for an appropriate amount of time such that it settles on the attractor and the first Lyapunov vector aligns itself. Afterwards, we can obtain the first Lyapunov exponent from the exponential divergence of v⃗. To avoid numerical over- or underflows, we need to rescale v⃗ regularly. For each further Lyapunov exponent, we need to evolve another tangent vector and regularly remove its projections onto the directions of larger Lyapunov vectors (which we get as a side result when computing the larger Lyapunov exponents).

Many dynamics have a zero Lyapunov exponent corresponding to the direction of the temporal flow. Specifically, this applies to every time-continuous, bounded, non-fixed-point dynamics [20].
This can be seen as follows: Suppose our initial perturbation for determining the Lyapunov exponent is exactly in the direction of the temporal flow, i.e., y⃗(t) = x⃗(t + ε). Since the dynamics is bounded, we can find arbitrarily large τ such that x⃗(t + τ) ≈ x⃗(t). Since the phase-space flow is continuous, we also have y⃗(t + τ) = x⃗(t + τ + ε) ≈ x⃗(t + ε) = y⃗(t) and thus:

|x⃗(t + τ) − y⃗(t + τ)| ≈ |x⃗(t) − y⃗(t)|.

Hence our trajectories neither diverge nor converge on average, and the corresponding Lyapunov exponent must be zero.

[Figure: a trajectory x⃗ and its time-shifted copy y⃗(t) = x⃗(t + ε), both returning close to their starting points after the time τ.]

A continuous-time chaotic system thus must have at least the following Lyapunov exponents:
• A positive exponent for sensitivity to initial conditions.
• A zero exponent for the direction of time.
• A negative exponent for the dynamics to be bounded (negative or zero divergence), corresponding to the direction where the dynamics is folded back into itself.

As a result, a continuous-time system needs at least three dimensions to be chaotic. More generally, we can classify (continuous-time, bounded) dynamics by the signs of their Lyapunov exponents:

signs of Lyapunov exponents      | dynamics
−, −−, −−−, …                    | fixed point
+, ++, +++, …, +0, ++0, …        | not possible (unbounded dynamics)
0, 00, 000, …                    | regular Hamiltonian dynamics
0−, 0−−, 0−−−, …                 | attractive limit cycle
00−, 00−−, 00−−−, …              | attractive limit torus (quasiperiodic)
000−, 0000−, …, 000−−, …         | attractive limit hypertorus (quasiperiodic)
+0−, +0−−, +0−−−, …              | chaos
++0−, +++0−, …, ++0−−, …         | hyperchaos
∞, …                             | noise

1.5.2 Strange attractors

So far, we only considered two special cases of attractors, namely fixed points and limit cycles. More generally speaking, an attractor is the dynamics of a system for t → ∞ or the set of states visited by this dynamics.
From a different point of view, the attractor is a minimal set of states that is invariant under time evolution and that has a neighborhood of points converging to it. The attractor of the Lorenz system (projected from 3D to 2D) looks like this:

[Figure: the Lorenz attractor; adapted from commons.wikimedia.org/wiki/File:Lorenz_system_r28_s10_b2-6666.png]

Attractors with chaotic dynamics are strange attractors or fractal attractors, which means that they exhibit a certain degree of self-similarity, i.e., they consist of smaller copies of themselves. In the Lorenz attractor, this self-similarity manifests in the structure of the trajectory bundles, parts of which are similar to the entire bundle. This property is related to the stretching and folding nature of chaos: During temporal evolution, the attractor gets stretched (along Lyapunov vectors corresponding to positive exponents) but also folded onto itself (along Lyapunov vectors corresponding to negative exponents). For illustration, consider the following cartoon of the Lorenz attractor:

[Figure: a cross-section 𝒜 of the attractor and its image Ψ(𝒜) under the flow, stretched along the direction with λ₁ > 0 and contracted along the direction with λ₃ < 0; λ₂ = 0 points out of the plane.]

𝒜 is a cross-section of the attractor (a Poincaré section). The temporal evolution takes one half of it, stretches it and maps it onto itself. Therefore, the entire cross-section contains a small copy of itself.

1.5.3 What systems are chaotic?

For the purposes of most applied research, it suffices to define chaos as a deterministic, bounded dynamics with a positive Lyapunov exponent. If we wish to be more rigorous, there are several competing definitions, but they differ only in pathological examples with little to no impact on most applications. Two clear necessary conditions for chaos are:
• nonlinearity: More specifically, for the system x⃗̇ = f⃗(x⃗), the derivative f⃗ needs to be nonlinear. The blunt argument for this is that we can solve systems with a linear f⃗ and they are not chaotic.
• three dimensions of the phase space.
Mind that this only applies to time-continuous autonomous systems. Driven systems or maps can be chaotic with fewer dimensions.

While not every nonlinear system with at least three dimensions is chaotic, chaos becomes more and more prevalent with stronger nonlinearities and higher dimensionality. Therefore, chaos is the default assumption for the dynamics of a deterministic system. The chaoticity of the double pendulum is not remarkable – the regularity of the single pendulum is.

Chaos is deterministic, which means that its temporal evolution does not depend on randomness – as opposed to stochastic processes. We can make the following distinction:
• For regular processes, identical initial conditions lead to identical outcomes and similar initial conditions lead to similar outcomes.
• For chaotic processes, identical initial conditions still lead to identical outcomes, but similar initial conditions do not necessarily lead to similar outcomes.
• Stochastic processes yield different outcomes even for identical initial conditions.

However, the more complex a deterministic process becomes – be it by becoming more nonlinear or by increasing in dimension – the more difficult it is to distinguish from a stochastic process. In particular, for many real processes, the question of whether they are actually stochastic or chaotic is moot. Instead, the crucial question is whether a chaotic model helps us understand or predict the process in a way that a stochastic model does not. After all, chaoticity and stochasticity are primarily properties of mathematical models.

1.5.4 Detecting chaos from data – nonlinear time-series analysis

Unlike fixed-point and periodic dynamics, chaotic dynamics are much more difficult to deduce from observables of a system. Mind that the central question here is not whether there is chaos – in some sense there certainly is, as elaborated above – but whether chaos dominates the dynamics and yields meaningful insights.
Also, often the main challenge is to distinguish a chaotic signal from a stochastic one. We here give a brief overview of the most important tools from nonlinear time-series analysis that allow us to do this [21].

The central tool is the delay embedding (also: Takens embedding) of a univariate time series x₁, x₂, …, x_N. It considers the series of m-dimensional states:

z⃗₁ = (x_t, x_{t+τ}, x_{t+2τ}, …, x_{t+(m−1)τ}),   z⃗₂ = (x_{t+1}, x_{t+1+τ}, x_{t+1+2τ}, …, x_{t+1+(m−1)τ}),   …

A central result is Takens’ embedding theorem (and its extensions, in particular the fractal embedding theorem by Sauer, Yorke and Casdagli [29]), which gives us that such a delay embedding yields a structure that is topologically equivalent to the actual attractor of the system and preserves properties such as Lyapunov exponents – for sufficiently high m and properly chosen τ. For example, we can reconstruct the Lorenz attractor with m = 3 and τ = 0.08. Despite the shortcomings of a 2D projection, we can directly see that the two-hole structure with respective regions of plain oscillation and merging is preserved:

[Figure: 2D projections of the original Lorenz attractor and of its delay-embedding reconstruction (τ = 0.08).]

From a proper delay embedding, we can directly get an idea about the central mechanisms of a dynamics, we have a tool to predict the dynamics, and we can perform further analyses such as calculating Lyapunov exponents. However, with many of these approaches, we must take care that our results are not confounded by noise. An important technique for this are surrogate time series, which are constructed as follows:
1. Fourier-transform the time series.
2. Randomise the phases in the Fourier transform, but keep the amplitudes.
3. Apply an inverse Fourier transformation to obtain the surrogate time series.
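The three steps above can be sketched in a few lines; the toy signal and the random seed are arbitrary choices for illustration:

```python
import numpy as np

def phase_surrogate(x, rng):
    """Phase-randomised surrogate: keep the amplitude spectrum, scramble phases."""
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(X))
    phases[0] = 0.0            # keep the zero-frequency (mean) component real
    if len(x) % 2 == 0:
        phases[-1] = 0.0       # the Nyquist component must also stay real
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x))

rng = np.random.default_rng(1)
t = np.arange(4096) * 0.05
x = np.sin(t) + 0.3 * np.sin(np.pi * t)   # a toy signal standing in for data

s = phase_surrogate(x, rng)

# Linear properties are preserved: the amplitude spectra (and hence the
# power spectrum, autocorrelation and variance) of original and surrogate agree.
same_spectrum = bool(np.allclose(np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(s))))
```

Keeping the amplitudes while scrambling the phases is exactly what preserves the linear statistics; any nonlinear temporal structure of the original signal is destroyed.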
This surrogate time series retains all linear properties of the original time series, as these are completely captured by the power spectrum. However, the procedure destroys all nonlinear properties, in particular chaos. We can then apply our analysis chain (e.g., phase-space reconstruction and computing Lyapunov exponents) to an ensemble of surrogate time series. If a result persists for the surrogates, we may not attribute it to chaos, but must assume that it is caused by noise or by shortcomings of our analysis procedure.

2 Fluctuations in biological processes

In biological systems, the relevant players (molecules, cells, individuals) are often present in small numbers. This implies that the deterministic equations we used throughout section 1 to describe the time evolution of continuous variables are only approximately valid. More precisely, the variables that specify the state of a biological system are often discrete numbers. For example, the number of copies N of a protein in a cell is an integer N ≥ 0. The same holds for the number of individual organisms in a population. When these numbers are small, fluctuations can become important and must be taken into account to understand the dynamics of these systems. For a useful introduction to this problem and its mathematical treatment, see also [26].

2.1 Random processes in cells

Reactions between molecules and the initiation of biological processes typically require the relevant molecules to collide. Examples are an enzyme that needs to collide with and bind to a metabolite to catalyze a reaction that modifies this metabolite, or the initiation of transcription, which requires RNAP to bind to the relevant promoter region in the DNA. Since most molecules inside a cell move around by diffusion and their collisions with diverse other molecules are random, it is a plausible assumption that these events are completely random and memoryless.
2.1.1 Poisson processes

Indeed, many events in biological systems can be described as Poisson processes. We consider a sequence of events that occur at random times t₁, t₂, … in a time interval T. We split the time interval T into many small time slots of size Δt. We make Δt small enough that each time slot contains at most a single event. For a Poisson process, the probability for an event to occur in each short time slot is ξ = βΔt, independent of what happens in any other time slot. The constant β is the mean rate of the Poisson process. In this construction, Δt must be chosen small enough that βΔt ≪ 1; the Poisson process is defined by the limit Δt → 0. Thus, a Poisson process consists of events that happen at a certain typical rate on average, but the exact event times are completely random: Different events have no effect on each other.

Example: Radioactive decay.

Example: Bortkiewicz’s analysis of deaths due to horse kicks. In the 19th century, the Prussian military kept meticulous records of the number of soldiers killed by being kicked by a horse each year over a 20-year period. Bortkiewicz showed that those tragic events obey a Poisson distribution. This classical example highlights that even complex situations can often be described as Poisson processes to a very good approximation.

Waiting time distribution: The time interval t_w between two events is called the waiting time. For a Poisson process, the waiting times t_w follow an exponential distribution:

p(t_w, β) = β e^(−β t_w).

This result implies that the average waiting time is

⟨t_w⟩ = ∫₀^∞ dt_w t_w p(t_w, β) = ∫₀^∞ dt_w t_w β e^(−β t_w) = 1/β.

As expected, the average waiting time is just the inverse of the mean rate of the Poisson process. The variance of the waiting times is ⟨t_w²⟩ − ⟨t_w⟩² = β⁻². The waiting time distribution can often be observed (i.e., sampled) experimentally.
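These properties are easy to check numerically; the rate and the sample size below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 2.0                                    # mean rate (illustrative value)
tw = rng.exponential(1 / beta, size=200_000)  # sampled waiting times

mean_tw = tw.mean()    # → ⟨t_w⟩ = 1/beta = 0.5
var_tw = tw.var()      # → ⟨t_w²⟩ − ⟨t_w⟩² = 1/beta² = 0.25

# Memorylessness: given that we have already waited a time s without an event,
# the remaining waiting time is again exponential with the same mean 1/beta.
s = 0.3
remaining = tw[tw > s] - s
mean_remaining = remaining.mean()   # → also ≈ 0.5
```

The last check makes the "memoryless" property from the previous section concrete: conditioning on having already waited does not change the distribution of the remaining waiting time.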
An exponential waiting time distribution then indicates an underlying Poisson process.

Example: Cell-fate decision making in B. subtilis. The bacterium B. subtilis can switch between a motile and a sessile state. Experiments that followed individual cells over long times in a microfluidic device showed that the waiting-time distribution for switching into the motile state is exponential, suggesting that a memoryless Poisson process controls this switch. In contrast, the time spent in the sessile state is more tightly controlled [27].

Example: Cell death caused by acid stress. E. coli bacteria growing in a microfluidic chamber that are suddenly exposed to a high concentration of an acid die after variable times. These survival times are exponentially distributed [24], suggesting again that cell death is caused by memoryless random events that can be described as a Poisson process.

Distribution of counts: For a Poisson process with rate β, the probability of observing l events in a time interval T is given by the Poisson distribution:

P_Poisson(l, βT) = e^(−βT) (βT)^l / l!

Key properties of Poisson processes

Thinning property: A new random process generated by removing each event of a Poisson process with probability ξ_thin is also a Poisson process, with the (lower) mean rate (1 − ξ_thin)β.

Merging property: Merging two Poisson processes with mean rates β₁ and β₂, i.e., treating each event generated by either process as an event of the new merged process, results again in a Poisson process, with rate β₁ + β₂.

2.1.2 Birth-death process

The birth-death process is a simple model of transcription that captures molecule-number fluctuations. Transcripts (mRNA) of a gene are produced and degraded. Production is a zeroth-order reaction because it does not depend on the current state of the system (the molecule numbers present).
Degradation is a first-order reaction: Its mean rate is proportional to the number of mRNA molecules l, i.e., β_d = k_d l. Both production and degradation are described as Poisson processes. At discrete time points, the molecule number suddenly increases or decreases by one; most of the time, nothing happens. We can write down the probability P(l_{i+1} | l_i) that the system reaches l_{i+1} mRNA molecules in the short time interval Δt, if it currently has l_i mRNA molecules:

P(l_{i+1} | l_i) =
  β_s Δt                    if l_{i+1} = l_i + 1 (synthesis)
  k_d l_i Δt                if l_{i+1} = l_i − 1 (degradation)
  1 − (β_s + k_d l_i) Δt    if l_{i+1} = l_i (nothing happens)
  0                         otherwise                              (1)

Starting from an initial state l₀, this determines the probability distribution at any later time. The birth-death process has the Markov property: The current state of the system completely determines the probability distribution of possible states at later times (i.e., the prior history of the system does not affect the dynamics).

Continuous deterministic limit: For large molecule numbers l, changes of l by one are negligible and we can treat l as a continuous variable. In this limit, we can calculate ⟨l_{i+1}⟩, i.e., the expectation of l_{i+1}, to see that eq. (1) becomes

dl/dt = β_s − k_d l.

Stationary state: For t → ∞, the birth-death process reaches a stationary state in which fluctuations of l about the stable fixed point l* = β_s/k_d of the deterministic system still occur, but the probability distribution of l is constant in time. In this stationary state, the probability of observing different values of l is given by a Poisson distribution:

P_Poisson(l, β_s/k_d) = e^(−β_s/k_d) (β_s/k_d)^l / l!

Master equation: To obtain an equation that describes the continuous time evolution of P(l), we take the limit Δt → 0 for the birth-death process defined in eq. (1).
dP(l)/dt = β_s P(l − 1) − β_s P(l) + k_d (l + 1) P(l + 1) − k_d l P(l).

The first two terms on the right capture the change of the probability of l molecules being present due to synthesis; the last two terms capture degradation. An equation of this type is called a (chemical) master equation.

2.1.3 Master equation and Gillespie algorithm

In general, the master equation is a set of first-order differential equations that describes how the probability P_k of a system to occupy each of its different states changes over time:

dP_k/dt = ∑_{l≠k} (β_{kl} P_l − β_{lk} P_k).

The Gillespie algorithm is a simple and efficient way of solving a stochastic system like the birth-death process numerically. The procedure is straightforward:
1. Draw the waiting time for the next event from the exponential distribution with rate β_tot, i.e., the sum of the rates of all reactions that can take place in the current state.
2. Draw a uniformly distributed random number between 0 and 1 to determine which reaction happened. Here, the probability for each reaction k with mean rate β_k is β_k/β_tot.
3. Update the state variables and recalculate the mean rates.
4. Repeat.

The Gillespie algorithm yields statistically correct trajectories (exact possible solutions) of the master equation. Probability densities and statistical properties of the solutions can be calculated by simulating many stochastic trajectories.

Example: For the birth-death process, β_tot = β_s + k_d l in step 1, and the probabilities for a synthesis or a degradation event to occur in step 2 are β_s/β_tot and k_d l/β_tot, respectively.

Step 1 in practice: To draw a waiting time from the exponential distribution, we can draw a uniformly distributed random number between 0 and 1 and convert it to a waiting time by inverting the cumulative distribution function F(t_w) = 1 − e^(−β_tot t_w).
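The steps above can be sketched for the birth-death process as follows (the parameter values are illustrative; `rng.exponential` stands in for the explicit inverse-CDF step):

```python
import numpy as np

rng = np.random.default_rng(1)

def gillespie_birth_death(beta_s, k_d, t_max, l0=0):
    """Gillespie simulation of the birth-death process; returns l at time t_max."""
    t, l = 0.0, l0
    while True:
        beta_tot = beta_s + k_d * l
        t += rng.exponential(1 / beta_tot)        # step 1: waiting time
        if t > t_max:
            return l
        if rng.uniform() < beta_s / beta_tot:     # step 2: which reaction?
            l += 1                                # synthesis
        else:
            l -= 1                                # degradation

beta_s, k_d = 10.0, 1.0     # stationary mean l* = beta_s/k_d = 10
samples = np.array([gillespie_birth_death(beta_s, k_d, t_max=20.0)
                    for _ in range(2000)])
mean_l = samples.mean()            # → ≈ 10
fano = samples.var() / mean_l      # → ≈ 1 (Poissonian stationary state)
```

Collecting many independent trajectories, as in the last lines, is the standard way to recover the probability distribution described by the master equation; here the stationary mean and the Fano factor of 1 match the Poisson result from section 2.1.2.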
Origin of the name of this algorithm: Gillespie derived this algorithm in 1976 to simulate well-stirred chemical reactions [17], but it actually applies to any continuous-time discrete-state Markov process. The algorithm was used many times long before that and was published, e.g., by Doob already in 1944. It was undoubtedly also well understood since shortly after Markov processes were first introduced. We have only scratched the surface here; for more on the theoretical description of stochastic processes, see [18, 36].

2.1.4 Processes consisting of many steps

Given the prevalence of fluctuations in cellular processes, it may seem that virtually no cellular process can be deterministic. On the other hand, we know that certain processes such as replicating a chromosome take a quite well-defined time (e.g., 40 minutes for E. coli). Replicating a chromosome consists of millions of reaction steps in which complementary bases are added one at a time to the new DNA strand. Each of these steps can be described as a Poisson process. Thus, the waiting time for the entire process results from adding up several million exponentially distributed waiting times with similar means. This becomes almost deterministic for the same reason that flipping a coin millions of times yields a total number of heads that is almost predictable: The randomness of the individual steps averages out. A general result for the completion time t_c of such reactions consisting of n steps is

r = var(t_c)/⟨t_c⟩² = (⟨t_c²⟩ − ⟨t_c⟩²)/⟨t_c⟩² ≥ 1/n.

Here, r is the randomness parameter. This relation can be used to identify the minimum number of rate-limiting steps involved in a reaction [25].

2.2 Noise in gene expression

Is the simple birth-death process from section 2.1.2 an accurate description of transcription in living cells?
2.2.1 Testing the birth-death process as a stochastic model of transcription

The birth-death process makes several experimentally testable predictions. We focus on three:

1. Upon induction of expression, the average transcript number per cell increases as ⟨l(t)⟩ = (β_s/k_d)(1 − e^(−k_d t)).
2. In the stationary state reached for t → ∞, the transcript counts are Poisson distributed; this implies that var(l) = ⟨l⟩, i.e., the Fano factor var(l)/⟨l⟩ = 1.
3. Upon induction of expression, the fraction of cells with zero transcripts initially decreases exponentially with time: P(0) = exp(−β_s t).

Counting mRNAs in (living) cells: To test such predictions of theoretical descriptions of stochastic effects in gene expression experimentally, we need to count molecule numbers in cells, ideally over time and while the cells are still alive. This is a challenging problem: Individual molecules are not easily visualized using microscopy. Still, there are several recently developed techniques that enable such measurements. To image a single mRNA inside a cell, we usually need ways to increase the signal-to-noise ratio (to distinguish the molecule of interest from the fluorescence background). One way of achieving that is to get many fluorescent molecules to localize to the mRNA of interest. Techniques for counting mRNAs inside cells include smFISH, MS2-GFP, and single-cell RNA-seq; see lecture slides. In particular, the MS2-GFP technique can be used to observe the numbers of transcripts of a gene in individual living cells over time. It is thus ideally suited to test whether the birth-death process from section 2.1.2 is a faithful description of the transcription process. The experimental data obtained with the MS2-GFP technique are clearly inconsistent with the last two predictions above [19]. Thus, transcription in bacteria cannot be accurately described as a simple birth-death process.
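As a sanity check of the predictions themselves, predictions 1 and 3 can be verified by simulating the birth-death process directly (parameters are illustrative; P(0) = exp(−β_s t) describes only the initial decay, so that comparison is approximate):

```python
import numpy as np

rng = np.random.default_rng(2)
beta_s, k_d, t_obs = 2.0, 0.1, 1.0   # illustrative parameters

def transcripts_at(t_max):
    # Gillespie simulation of the birth-death process starting from zero mRNAs
    t, l = 0.0, 0
    while True:
        rate = beta_s + k_d * l
        t += rng.exponential(1 / rate)
        if t > t_max:
            return l
        if rng.uniform() < beta_s / rate:
            l += 1
        else:
            l -= 1

samples = np.array([transcripts_at(t_obs) for _ in range(5000)])

pred_mean = beta_s / k_d * (1 - np.exp(-k_d * t_obs))   # prediction 1
sim_mean = samples.mean()

pred_p0 = np.exp(-beta_s * t_obs)                       # prediction 3 (early times)
sim_p0 = (samples == 0).mean()                          # fraction with zero mRNAs
```

The simulated induction curve and the zero-transcript fraction match the analytical predictions, so the disagreement with the MS2-GFP data really reflects the model, not the mathematics.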
Is there a slightly more complicated model of transcription that is consistent with these data?

2.2.2 Transcription as a bursting process

The Fano factor greater than one could be explained if transcripts are not produced one at a time but rather in transcriptional bursts, in which a larger number of transcripts is produced in a short time. Assuming for simplicity that each transcription event produces exactly m transcripts, the expectation of l will increase by m, while its variance should increase by m². Thus, the Fano factor increases with the burst size m. Even though the master equation cannot be easily solved for this case, it is possible to derive expressions for the expectation and the variance of the molecule number l directly from the master equation:

d⟨l⟩/dt = m β_s − k_d ⟨l⟩,
d var(l)/dt = m² β_s + k_d ⟨l⟩ − 2 k_d var(l),
… ⟹ var(l)/⟨l⟩ = (m + 1)/2 in the stationary state.

Bursting model of transcription: In the more realistic bursting model of gene expression, we relax the assumption that the burst size is constant. In this model:
• The gene switches spontaneously between on and off states.
• Transcripts are produced as in the birth-death process, but only when the gene is in the on state.

The transitions between on and off states are assumed to be Poisson processes. This was validated directly in experiments: The waiting time distributions for entering and leaving the on state are indeed exponential [19]. Importantly, the data of Golding et al. constrain the parameters of the bursting model [19]. In particular:

1. From the waiting time distribution, β_on ≈ 1/(37 min).
2. The average burst size is determined by the Fano factor in the stationary state and is m ≈ 5.

With these parameters constrained, there are two additional testable predictions for the dynamics following the induction of gene expression (i.e., starting at zero mRNAs per cell):

1.
As cells that switch to the on state almost immediately produce a transcript, P(0) = exp(−β_on t) in this model, which agrees with experimental observations.

2. The average number of transcripts increases as ⟨l(t)⟩ = (m β_s/k_d)(1 − e^(−k_d t)). Here, the only remaining parameter is k_d. The best fit to the experimental data is obtained for k_d = 0.014 min⁻¹. But since the transcript is essentially stable in this experiment, k_d should result from dilution at cell division. The bacteria were dividing every T = 50 min in this experiment, which indeed leads to an effective degradation rate k_d = ln 2/T ≈ 0.014 min⁻¹.

The bursting model is consistent with many key features of gene expression, but it is certainly not a complete description of this process. Bursting was observed in diverse other systems, including in eukaryotes. However, not all genes burst: Bursting is a controlled feature. The molecular mechanisms that underlie bursting are usually not understood. For example, the packing and unpacking of DNA (supercoiling) could underlie this phenomenon.

2.2.3 Gene expression model including translation

The translation of transcripts into proteins can easily be included as another process in the birth-death model. Key results from this model with translation:

1. Bursting also occurs at the level of proteins, specifically in the parameter regime where the mRNA molecule number per cell l fluctuates between one and zero over time.

2. Time-averaging can reduce strong fluctuations in the mRNA copy number l (since the lifetime of proteins is typically much greater than that of mRNAs). Thus, the variability of the protein copy number n can be lower than that of the mRNA. It can be shown that, in the stationary state,

var(n)/⟨n⟩² = 1/⟨n⟩ + (1/⟨l⟩) · k_d^(p)/(k_d^(m) + k_d^(p)).

Here, the second term shows how low mRNA molecule numbers lead to increased variability of the protein numbers.
However, these fluctuations are increasingly averaged out if k_d^(m) ≫ k_d^(p), i.e., if the lifetime of the protein is much greater than that of the mRNA.

3. The stationary protein copy number distribution (with some approximations) is given by [23]

P(n) = n^(a−1) e^(−n/b) / (b^a Γ(a)).

Here, n is the protein copy number, b is the burst size (equal to the Fano factor), a is the burst frequency (i.e., the number of bursts per cell generation) and Γ is the gamma function. This result agrees well with experimental observations [23, 34].

Detecting individual fluorescent proteins in living cells. Idea: Tag the molecule of interest with a single fluorescent molecule and image for a long time to achieve a sufficiently high signal-to-noise ratio. A problem with this approach is that molecules in the cytosol of living cells quickly move around by diffusion. The diffusion coefficient for a fluorescent protein the size of GFP in the cytosol is D ≈ 100 μm²/s. Thus, the time scale for diffusion across a bacterium of size L ≈ 1 μm is τ_D = L²/D ≈ 0.01 s. To emit sufficiently many photons for detection, fluorescent proteins typically need to be imaged for seconds. Thus, diffusion needs to be slowed down somehow; see lecture slides for different techniques. Once the location of a fluorescently tagged molecule is frozen, it can be detected and localized more precisely than the diffraction limit. A remaining technical problem is photo-bleaching: The fluorescent molecule is damaged (due to energy transferred in excitation) and ceases to fluoresce. Fluorescent proteins such as GFP and its derivatives bleach relatively easily.

2.2.4 Functional relevance of gene expression noise

There is considerable evidence that fluctuations in gene expression are not just an inevitable side-effect of low molecule copy numbers inside cells, but can provide benefits that were selected for in evolution.
Bet hedging: Populations of genetically identical cells may exploit such fluctuations to create phenotypic heterogeneity and diversify the population. This can be beneficial when unpredictable changes occur in the environment, which is the case for free-living microbes.

Example: Bacterial persistence. In a growing population of clonal bacteria, a small fraction of the bacteria switches into a non-growing persister state. Bacteria in this state are protected from antibiotics and other stressors. Thus, the survival chances of the population as a whole (and of its genotype) are increased by this diversification. Switching into and out of the persister state appears entirely random [4]. While the molecular mechanisms are complicated and not fully understood, switching is almost certainly caused by fluctuations in gene expression.

Division of labor. Gene expression fluctuations can cause genetically identical cells to become phenotypically different and perform different tasks, with a fitness benefit for the population as a whole.

Example: Salmonella infection. When Salmonella bacteria infect the gut of a mammalian host, they randomly split into two sub-populations, in which the gene tts-1 is either on or off, respectively. One of the sub-populations invades the host cells and thus causes the infection; this lowers the survival of this sub-population. However, the other sub-population benefits from the infection and grows faster because of it. Overall, this enables these bacteria to proliferate faster and increases their chances of transmission to a new host [2, 1].

3 Biological pattern formation

Spatial patterns occur all over biology, from the stripes on a zebra to those on the wings of a butterfly. Biological patterning also occurs during animal development, where initially identical cells acquire different fates depending on their position in a tissue. How do such patterns come about?

Reaction-diffusion systems.
These are systems of partial differential equations of the form

𝜕𝑡 𝑞⃗ = 𝐷∇²𝑞⃗ + 𝑓⃗(𝑞⃗),

where 𝑞⃗(𝑥⃗, 𝑡) = (𝑞1(𝑥⃗, 𝑡), 𝑞2(𝑥⃗, 𝑡), …, 𝑞𝑁(𝑥⃗, 𝑡)) depends on space and time and 𝐷 is a diagonal matrix of diffusion coefficients. Reaction-diffusion systems are a natural description of chemical reactions, but can also describe various non-chemical processes.

Example: Fisher's equation. 𝜕𝑡 𝑞 = 𝐷𝜕𝑥²𝑞 + 𝑞(1 − 𝑞) with the population size 𝑞(𝑥, 𝑡) in one space dimension. Locally, the dynamics corresponds to the logistic growth equation. In addition, there is non-directional spreading due to the diffusion term. This equation is used to describe the spread of biological populations. It has traveling wave solutions.

3.1 Self-organized patterns in reaction-diffusion systems

How can regular patterns (stripes, spots, etc.) emerge spontaneously, without an external source of spatial information?

3.1.1 Turing instability

First, consider a simple toy model [28] of two molecular species that react with each other:

𝑋̇ = 5𝑋 − 6𝑌 + 1
𝑌̇ = 6𝑋 − 7𝑌 + 1.

The production of both molecules is stimulated by 𝑋 and their degradation is proportional to 𝑌. Hence, we call 𝑋 the activator and 𝑌 the inhibitor. The system has a fixed point at 𝑋 = 𝑌 = 1. Linear stability analysis shows that this fixed point is stable.

Now, consider two cells next to each other following the same dynamics. The only difference is that the molecules can now diffuse freely between the two cells. Still, (𝑋1, 𝑌1, 𝑋2, 𝑌2) = (1, 1, 1, 1), i.e., uniform concentrations in both cells, is a fixed point. If anything, diffusion should even out any differences in concentration between the two cells. Hence, we expect a stable steady state with uniform concentrations of both molecules.
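These stability claims can be checked directly from the Jacobian. A minimal numerical sketch; the diffusive coupling strength 𝑑 between the two cells is an illustrative assumption:

```python
import numpy as np

# Jacobian of the single-cell toy model (5X - 6Y + 1, 6X - 7Y + 1)
# around the fixed point (1, 1):
A = np.array([[5.0, -6.0], [6.0, -7.0]])
eig_single = np.linalg.eigvals(A)
print(eig_single)                 # both eigenvalues equal -1: stable

# Two cells coupled by diffusion, dX1/dt gains a term d*(X2 - X1), etc.
# Here both species get the SAME coupling strength d:
d = 1.0
Dm = np.diag([d, d])
J = np.block([[A - Dm, Dm], [Dm, A - Dm]])
eig_coupled = np.linalg.eigvals(J)
print(eig_coupled.real.max())     # largest real part is negative: stable
```

With equal diffusion coefficients for 𝑋 and 𝑌, every eigenvalue of the coupled system stays negative, in line with the intuition above; unequal coefficients are a different story.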
Our intuition is strikingly wrong: A Turing instability, leading to different steady state concentrations of the molecules in the two cells, can occur if the inhibitor diffuses faster than the activator! In the simple toy model above, the concentration can then become negative. In a more realistic model, nonlinearities would prevent that, but the system would still go to a steady state with different concentrations in the two cells.

3.1.2 Turing patterns in one dimension

A Turing instability can lead to the emergence of spatially periodic patterns from uniform initial conditions [35].

Spontaneous stripe formation. Consider a one-dimensional chain of cells. The molecules 𝑋 and 𝑌 diffuse freely between neighboring cells. In addition, the same reactions that lead to the production and degradation of these molecules take place in all cells. In the absence of diffusion, the concentrations in cell 𝑟 obey

𝑋̇𝑟 = 𝑓(𝑋𝑟, 𝑌𝑟)
𝑌̇𝑟 = 𝑔(𝑋𝑟, 𝑌𝑟),

where 𝑓 and 𝑔 are nonlinear functions that describe the reactions in each cell. We assume that, in the absence of diffusion, (𝑋𝑟, 𝑌𝑟) = (ℎ, 𝑘) is a fixed point.

Analyzing the stability of the uniform steady state in the presence of diffusion shows that this state can become unstable, and a wave pattern with a preferred wavelength can spontaneously emerge, if one of the molecules is an inhibitor that diffuses faster than the other molecule. Thus, a Turing instability can lead to the spontaneous emergence of a stripe pattern. The wavelength 𝜆0 of the pattern scales as the square root of the diffusion coefficient of the slowly diffusing activator: 𝜆0 ∼ √𝐷𝑋.

In general, two features are crucial for pattern formation in this way: local self-enhancement and long-range inhibition [22].

3.1.3 Activator-inhibitor systems in two dimensions

In two space dimensions, the analysis becomes more complicated.
In this case, reaction-diffusion systems with a Turing instability are best analyzed numerically (Turing himself already did this).

Simple activator-inhibitor system [22].

𝜕𝑡 𝑎 = 𝐷a∇²𝑎 + 𝜌a 𝑎²/((𝐾a² + 𝑎²)ℎ) − 𝑘a𝑎 + 𝜈a
𝜕𝑡 ℎ = 𝐷h∇²ℎ + 𝜌h 𝑎² − 𝑘hℎ + 𝜈h,

with degradation rates 𝑘a and 𝑘h, cross-reaction coefficients 𝜌a and 𝜌h, and basal production rates 𝜈a and 𝜈h. The activator concentration is 𝑎(𝑥⃗, 𝑡), the inhibitor concentration ℎ(𝑥⃗, 𝑡). In this system, the formation of stable patterns via a Turing instability requires

• 𝐷h ≫ 𝐷a, i.e., local activation and long-range inhibition.
• 𝑘h ≫ 𝑘a, i.e., the (degradation) kinetics of the inhibitor needs to be fast.

Otherwise, this system can produce global oscillations or traveling waves [22].

Intuitive picture of pattern formation. Consider a domain that is patterned starting with a small local perturbation of the homogeneous state 𝑎 = ℎ = 0. The pattern forms like this:

• A peak of 𝑎 forms in some region, as 𝑎 amplifies its own production (directly or indirectly) and diffuses slowly.
• At some distance from the 𝑎 peak, the fast-diffusing inhibitor will dominate and switch off the production of 𝑎 (and ℎ).
• Even farther away from the 𝑎 peak, ℎ will drop to low values (since it is only produced in the region where 𝑎 is high and degraded at a high rate 𝑘h ≫ 𝑘a everywhere), leading to a new region where 𝑎 amplifies itself to a peak.

If the pattern is initiated by small random fluctuations, irregularly arranged peaks of 𝑎 form. However, each peak maintains a minimum distance from neighboring peaks (due to lateral inhibition), leading to an approximately hexagonal arrangement of the peaks.

Animal coat patterns. With the right choice of reactions and parameters, the standing waves created by Turing instabilities can lead to regular patterns of stripes or spots (the latter usually in an approximately hexagonal order).
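A minimal one-dimensional finite-difference simulation illustrates such spontaneous pattern formation. As a simplification of the saturated system above, the sketch uses the unsaturated Gierer–Meinhardt reaction terms 𝑎²/ℎ; all parameter values are illustrative assumptions:

```python
import numpy as np

def simulate(Da, Dh, N=256, dx=0.25, dt=0.01, steps=30000, seed=0):
    """Explicit Euler integration of a 1D activator-inhibitor system
    with periodic boundaries:
      da/dt = Da a'' + a^2/h - a
      dh/dt = Dh h'' + mu (a^2 - h)   (mu > 1: fast inhibitor kinetics)
    The homogeneous fixed point is a = h = 1."""
    rng = np.random.default_rng(seed)
    a = 1.0 + 0.01 * rng.standard_normal(N)   # small random perturbation
    h = np.ones(N)
    mu = 2.0
    for _ in range(steps):
        lap_a = (np.roll(a, 1) + np.roll(a, -1) - 2 * a) / dx**2
        lap_h = (np.roll(h, 1) + np.roll(h, -1) - 2 * h) / dx**2
        a = a + dt * (Da * lap_a + a * a / h - a)
        h = h + dt * (Dh * lap_h + mu * (a * a - h))
    return a

a_turing = simulate(Da=0.05, Dh=1.0)    # long-range inhibition: pattern
a_control = simulate(Da=0.05, Dh=0.05)  # equal diffusion: stays uniform
print(np.std(a_turing), np.std(a_control))
```

With 𝐷h ≫ 𝐷a the uniform state develops a stationary pattern of activator peaks, while with equal diffusion coefficients the same initial noise simply decays away.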
Patterns resembling animal coats, such as those of zebras, tigers, skunks, leopards, or cows, can be reproduced by Turing-type reaction-diffusion systems [3].

Dynamical patterns. New features appear in Turing-like systems with three or more different types of molecules: dynamic patterns like spirals can emerge. A famous example is the Belousov–Zhabotinsky reaction.

Conclusion. Reaction-diffusion systems with a Turing instability can create patterns similar to those observed in biology. However, the molecular mechanisms for the formation of animal coat patterns are generally unknown and certainly not consistent with a classical Turing mechanism: Among other issues, the stripes, spots, etc. on animal coats are simply too large to be generated by diffusion. Also, some examples are known where the migration of pigment-carrying cells plays a role [28]. However, even though the microscopic mechanism may be completely different (e.g., diffusion of molecules may need to be replaced with random migration of pigment-carrying cells), the mathematical description and the concept of the Turing instability likely still apply.

3.2 Morphogen gradients

Even though Turing introduced the term morphogen for the diffusing molecules that control patterning, patterns in animal development generally form by a different mechanism. Cells in a developing tissue that are initially identical copies of each other need to differentiate into cells with different functions (e.g., grow a bristle, become a nerve cell, a liver cell, etc.). Morphogen gradients are a particularly simple case of a reaction-diffusion system where cells obtain spatial information from the concentration of a diffusible signaling molecule.

3.2.1 French flag model and positional information

The morphogen forms a graded concentration profile across a tissue.
Cells detect (»read«) the morphogen with receptors on their surface and respond to it in a concentration-dependent manner: They differentiate into different cell fates depending on the local morphogen concentration they are exposed to. In this way, positional information about the distance from the morphogen source is encoded in the morphogen concentration [39].

In the French flag model, cells differentiate into the blue state if the morphogen concentration 𝑐 exceeds a threshold 𝑐1, into the white state if 𝑐2 < 𝑐 < 𝑐1 for a lower threshold 𝑐2, and into the red state if 𝑐 < 𝑐2. In this way, a stripe pattern with three different cell fates is generated in the target tissue. Importantly, there is considerable evidence that this is how tissue patterning works in reality, at least in essence. In two-dimensional or three-dimensional tissues, multiple orthogonal morphogen gradients can provide a coordinate system that, in principle, can specify the position of each cell.

3.2.2 Diffusion-degradation model

The morphogen is produced in a source region, spreads non-directionally with (effective) diffusion coefficient 𝐷, and is degraded everywhere at rate 𝑘. The dynamics of the morphogen concentration 𝑐(𝑥⃗, 𝑡) and the expression levels 𝑞⃗(𝑥⃗, 𝑡) of the target genes, which are transcription factors that control the cell fate, is given by:

𝜕𝑡 𝑐 = 𝐷∇²𝑐 − 𝑘𝑐
𝜕𝑡 𝑞⃗ = 𝑓⃗(𝑐, 𝑞⃗).

This model is relatively simple in that the target genes respond to the morphogen but do not diffuse and do not affect the morphogen. Thus, the equation for 𝑐 decouples and can be investigated separately.

We focus on the one-dimensional case: 𝜕𝑡 𝑐 = 𝐷𝜕𝑥²𝑐 − 𝑘𝑐. A morphogen source located at 𝑥 = 0 that produces morphogen at a constant rate is described by imposing a current 𝑗0 at 𝑥 = 0: −𝐷𝜕𝑥 𝑐|𝑥=0 = 𝑗0.

Steady state concentration profile.
For 𝑡 → ∞, an exponential concentration profile 𝑐(𝑥) = 𝑐0 exp(−𝑥/𝜆) with characteristic decay length 𝜆 = √(𝐷/𝑘) is reached. Examples of typical values of the decay length:

• 𝜆 ≈ 100 μm for the morphogen Bicoid in the Drosophila embryo.
• 𝜆 ≈ 20 μm for Dpp in the Drosophila wing disk (the precursor of the fruit fly wing).
• 𝜆 ≈ 20 μm for Shh and BMP in the neural tube (the precursor of the brain and spinal cord in mammals).

Time-dependence of the concentration profile. A typical value of the diffusion coefficient for a morphogen protein in water is 𝐷 ≈ 10 μm²/s. Thus, spreading of the morphogen across 𝐿 = 100 μm takes about 𝐿²/𝐷 ≈ 10³ s. It would take too long to pattern much larger fields using diffusing morphogens. This is certainly part of the reason why tissue patterning occurs early in animal development, when tissues are still small.

3.2.3 Scaling of pattern with tissue size

Animals and tissues vary in size, especially as they grow during development. Cell differentiation patterns generally scale with the size of the tissue, i.e., stripes etc. are formed at a relative position (e.g., in the middle) of the tissue, irrespective of its absolute size. Proportions within and between tissues are preserved. The diffusion-degradation model cannot easily explain this phenomenon, since the length scale of the pattern is set by the diffusion and degradation constants.

Source-sink model. Localized degradation of the morphogen in a sink region could in principle generate a linear concentration profile that scales perfectly with tissue size. However, neither a morphogen that is degraded in a localized sink nor a concentration profile that is linear has been observed so far.

Opposing morphogen gradients. Scaling of the tissue pattern could be achieved with two morphogens produced in sources at opposite ends of the tissue. The cells in the target tissue could then make cell fate decisions based on both morphogen signals.
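A minimal numerical sketch illustrates the idea (assuming two opposing exponential profiles with equal decay lengths and amplitudes): the position where the two concentrations cross sits at the same relative position for any tissue size.

```python
import numpy as np

# Two morphogen sources at opposite ends of a tissue of length L.
# Both profiles are exponential with the same (assumed) decay length lam.
def crossing(L, lam=20.0, c0=1.0, N=100001):
    x = np.linspace(0.0, L, N)
    c1 = c0 * np.exp(-x / lam)          # gradient from the source at x = 0
    c2 = c0 * np.exp(-(L - x) / lam)    # gradient from the source at x = L
    return x[np.argmin(np.abs(c1 - c2))]  # position where c1 = c2

for L in (100.0, 200.0, 400.0):
    print(L, crossing(L) / L)           # relative position stays at 0.5
```

The crossing point lies at 𝑥 = 𝐿/2 for every 𝐿, i.e., this fate boundary scales perfectly even though each individual gradient has a fixed decay length.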
In this scenario, at least one position (e.g., the middle of the tissue) can scale perfectly with tissue size. Scaling elsewhere can also be improved: in essence, the cells need to »decode« the two morphogen concentrations 𝑐1, 𝑐2 to estimate their position 𝑥(𝑐1, 𝑐2). This scenario may apply, for example, to the neural tube, where two morphogens (Shh and BMP) are produced at the dorsal and ventral ends of the tissue [40]. In general, it is an open question how scaling is achieved.

3.2.4 Robustness of morphogen gradients

Just like engineered systems, biological systems should be robust to perturbations that can occur easily and would perturb the function of the system. It has been observed that the production rate of different morphogens in the source often varies considerably among individuals. At the same time, the resulting morphogen concentration profile and tissue pattern vary much less.

Robustness of the linear diffusion-degradation model. Increasing the current 𝑗0 at the boundary by 𝛿𝑗0 leads to a linear increase of the steady state concentration everywhere. As a result, all concentration thresholds defining the boundaries between cell fates shift by

𝛿𝑥 = 𝜆 ln(1 + 𝛿𝑗0/𝑗0).

How can the system be made more robust to such variations?

Nonlinear degradation. We can describe self-amplified degradation of the morphogen with a nonlinear degradation term:

𝜕𝑡 𝑐 = 𝐷𝜕𝑥²𝑐 − 𝜅𝑐ⁿ.

Self-amplified degradation can easily occur in biological systems. For example, degradation of the morphogen often occurs inside cells, only after binding to a cell surface receptor and internalization. Morphogens are known to trigger upregulation of receptor production. This provides a straightforward mechanism for self-amplified degradation.
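The sensitivity of the linear model quantified by the shift formula 𝛿𝑥 = 𝜆 ln(1 + 𝛿𝑗0/𝑗0) can be made concrete with a quick numerical check; the steady state 𝑐(𝑥) = (𝑗0𝜆/𝐷) 𝑒^{−𝑥/𝜆} places a fate boundary at 𝑥* = 𝜆 ln(𝑗0𝜆/(𝐷𝑐*)). All parameter values below are illustrative assumptions:

```python
import numpy as np

D, k = 10.0, 0.01            # um^2/s and 1/s (assumed values)
lam = np.sqrt(D / k)         # decay length, here ~31.6 um
c_thr = 0.1                  # arbitrary fate-threshold concentration

def boundary(j0):
    # position where the steady-state profile crosses c_thr
    return lam * np.log(j0 * lam / (D * c_thr))

j0 = 1.0
for dj in (0.1, 0.5, 1.0):   # 10%, 50%, 100% increase in production
    shift = boundary(j0 + dj) - boundary(j0)
    print(dj, shift, lam * np.log(1 + dj / j0))   # both columns agree
```

For a twofold increase in production, every fate boundary shifts by 𝜆 ln 2 ≈ 0.69𝜆, i.e., by tens of micrometers for typical decay lengths, which is why the linear model is not robust and nonlinear mechanisms are of interest.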
The steady state concentration profile for 𝑥 ≥ 0, with the current 𝑗0 > 0 imposed at 𝑥 = 0 to describe the morphogen source, has the power-law form

𝑐(𝑥) = 𝐴/(𝑥 + 𝛿)^𝑚,

where 𝑚 = 2/(𝑛 − 1), 𝐴 is an amplitude that depends on the parameters 𝐷, 𝜅, 𝑛, and 𝛿 > 0 is constrained by the boundary condition 𝑗0. This steady state morphogen profile becomes robust to changes of 𝑗0 for large values of 𝑗0. The origin of this robustness is the singularity at 𝑥 = −𝛿: Increasing 𝑗0 corresponds to shifting the steady state profile 𝑐(𝑥) to the right until 𝑗0 = −𝐷𝜕𝑥 𝑐|𝑥=0. As the singularity is approached, the shift 𝛿𝑥 of the profile becomes tiny for any change 𝛿𝑗0, leading to 𝛿𝑥 ≈ 0. Thus, nonlinear degradation yields steady state profiles that can be made almost perfectly robust to variations in morphogen production [14].

Nonlinear diffusion. Similarly, robustness can be achieved if the effective diffusion coefficient depends on the morphogen concentration:

𝜕𝑡 𝑐 = 𝜕𝑥 [𝐷(𝑐)𝜕𝑥 𝑐] − 𝑘𝑐.

For example, the effective diffusion coefficient can become concentration-dependent if morphogens are actively transported through cells in a way that requires binding to a receptor (planar transcytosis) [7]. Then, 𝐷(𝑐) ∼ 𝑐⁻² for large 𝑐, i.e., transport becomes slow as the receptors needed for transport are increasingly occupied. This also leads to a singularity in the steady state profile and to high robustness against source fluctuations.

3.2.5 Precision of morphogen gradients

In a biological tissue, low molecule numbers of the morphogen and cell-to-cell variability that affects morphogen transport will lead to random variability of the morphogen concentration in time and space. Thus, the local morphogen concentration 𝑐(𝑥) has an error that can be quantified as the standard deviation 𝜎c(𝑥).
This error 𝜎c(𝑥) leads to a positional uncertainty 𝜎x about the distance from the morphogen source: At best, a cell detecting the morphogen concentration can determine its position with this uncertainty.

The positional uncertainty is inversely related to the local steepness of the morphogen concentration profile. In a linear approximation, an error 𝛿𝑐 in the morphogen concentration leads to an error

𝑥(𝑐 + 𝛿𝑐) ≈ 𝑥(𝑐) + (d𝑥/d𝑐) 𝛿𝑐

in the inferred position of the cell. If the average concentration is 𝑐0(𝑥) = ⟨𝑐(𝑥)⟩, we have for the positional uncertainty

𝜎x = 𝜎c / |d𝑐0/d𝑥|.

For an exponential concentration profile, the positional uncertainty is proportional to the variation coefficient 𝛴 of the morphogen concentration:

𝜎x = 𝜆 𝜎c(𝑥)/𝑐0(𝑥) = 𝜆𝛴(𝑥).

How does the positional uncertainty depend on different sources of fluctuations and on the distance from the morphogen source?

Diffusion-degradation model with frozen disorder. We describe the effects of cell-to-cell variability in the morphogen degradation rates and the diffusion coefficient in a diffusion-degradation model with frozen disorder. The degradation rate and the diffusion coefficient fluctuate in space. The morphogen current at the source boundary also fluctuates. Our goal is to calculate the effects of these fluctuations on the steady state concentration profiles.

Lattice model. A discrete version of this model on a one-dimensional lattice:

𝐶̇𝑛 = 𝑝𝑛−1(𝐶𝑛−1 − 𝐶𝑛) + 𝑝𝑛(𝐶𝑛+1 − 𝐶𝑛) − 𝑘𝑛𝐶𝑛, for 𝑛 > 0,

where 𝐶𝑛 denotes the particle number at lattice site 𝑛, 𝑝𝑛−1 is the rate for particle hopping between sites 𝑛 and 𝑛 − 1 (in both directions), and 𝑘𝑛 is the rate of particle decay at site 𝑛. The lattice begins at site 𝑛 = 0. We will investigate a constant influx 𝜈 of particles into the system, which is described by

𝐶̇0 = 𝜈 + 𝑝0(𝐶1 − 𝐶0) − 𝑘0𝐶0.

The 𝑝𝑛 are distributed randomly around a mean value 𝑝, i.e., 𝑝𝑛 = 𝑝 + 𝜂𝑛.
Here, 𝜂𝑛 is a random variable with ⟨𝜂𝑛⟩ = 0 and ⟨𝜂𝑛𝜂𝑗⟩ = 𝜎𝑝²𝛿𝑛𝑗. This implies that the values of 𝜂𝑛 at different lattice sites are uncorrelated. The 𝑘𝑛 are distributed in the same way: 𝑘𝑛 = 𝑘 + 𝜁𝑛 with ⟨𝜁𝑛⟩ = 0 and ⟨𝜁𝑛𝜁𝑗⟩ = 𝜎𝑘²𝛿𝑛𝑗. The rate of morphogen influx into the system 𝜈 also fluctuates, i.e., 𝜈 = 𝜈0 + 𝜒, where 𝜒 is a random variable with ⟨𝜒⟩ = 0 and ⟨𝜒²⟩ = 𝜎𝜈². For simplicity, we assume that there are no correlations between 𝑘𝑛, 𝑝𝑛, and 𝜒.

Continuum limit.

𝜕𝑡 𝑐(𝑥, 𝑡) = 𝜕𝑥 [𝐷 + 𝜂(𝑥)] 𝜕𝑥 𝑐(𝑥, 𝑡) − [𝑘 + 𝜁(𝑥)] 𝑐(𝑥, 𝑡),

where the Stratonovich interpretation of the multiplicative noise terms has to be used, 𝑥 = 𝑛𝑎, 𝑐(𝑥, 𝑡)|𝑥=𝑛𝑎 = 𝐶𝑛(𝑡)/𝑎, and 𝐷 = 𝑝𝑎² with the lattice constant 𝑎. The continuum limit of the noise correlators is ⟨𝜂(𝑥)𝜂(𝑥′)⟩ = 𝜎𝐷²𝑎𝛿(𝑥 − 𝑥′) and ⟨𝜁(𝑥)𝜁(𝑥′)⟩ = 𝜎𝑘²𝑎𝛿(𝑥 − 𝑥′). Here, 𝜎𝐷 = 𝜎𝑝𝑎². A fixed morphogen influx at site 𝑛 = 0 at rate 𝜈 = 𝜈0 + 𝜒 corresponds to a fixed current 𝑗0 + 𝜒 at 𝑥 = 0 with 𝑗0 = 𝜈0. This imposes the constraint

[𝐷 + 𝜂(𝑥)] 𝜕𝑥 𝑐(𝑥, 𝑡)|𝑥=0 = −𝑗0 − 𝜒.

We calculate the steady state concentration profile in the half-space 𝑥 > 0.

Steady state morphogen concentration fluctuations. A perturbative approach for small noise amplitudes yields that the variation coefficient

𝛴(𝑥) = √⟨(𝑐(𝑥) − ⟨𝑐(𝑥)⟩)²⟩ / ⟨𝑐(𝑥)⟩

of the morphogen concentration in steady state behaves as

𝛴(𝑥) = [𝛴B² + 𝛴𝑘(𝑥)² + 𝛴𝐷(𝑥)²]^{1/2},

with

𝛴B² = (𝜎𝜈/𝑗0)²
𝛴𝑘(𝑥)² = (𝑎/8𝜆)(𝜎𝑘/𝑘)² (3 − 𝑒^{−2𝑥/𝜆} + 2𝑥/𝜆)
𝛴𝐷(𝑥)² = (𝑎/8𝜆)(𝜎𝐷/𝐷)² (−1 + 3𝑒^{−2𝑥/𝜆} + 2𝑥/𝜆).

The asymptotic behavior of 𝛴(𝑥) for large 𝑥 is

𝛴(𝑥)² ≈ (𝑎𝑥/4𝜆²) [(𝜎𝑘/𝑘)² + (𝜎𝐷/𝐷)²].

The variation coefficient of the concentration grows as 𝛴(𝑥) ∼ √𝑥. Thus, the relative variability of the concentration increases with increasing distance from the morphogen source, in the same way as the typical distance traveled by a diffusing particle increases with time.

Two- and three-dimensional cases.
In two dimensions, the morphogen source is described as a line, e.g., the y-axis. The influx of morphogens fluctuates along this line:

−[𝐷 + 𝜂(𝑥⃗)] 𝜕𝑥 𝑐(𝑥⃗, 𝑡)|𝑥=0 = 𝑗0 + 𝜒(𝑦).

In three dimensions, the source is a surface, e.g., the yz-plane. Unlike in one dimension, the source fluctuations alone result in a 𝛴(𝑥) that decreases with increasing 𝑥: Spatial fluctuations in the morphogen source average out with increasing distance from the source. The combination of this effect with the fluctuations of the degradation rate and the diffusion coefficient, which cause 𝛴(𝑥) to increase with 𝑥, leads to a minimum of 𝛴(𝑥) at a finite distance from the morphogen source, where the precision of tissue patterning is highest. Such a minimum has indeed been observed experimentally for the morphogen Dpp in the Drosophila wing disk [8].

A perturbative approach analogous to the one-dimensional case shows that the variation coefficient of the morphogen concentration increases as 𝛴(𝑥) ∼ 𝑥^{1/4} in two dimensions (𝑑 = 2) and as 𝛴(𝑥) ∼ ln(𝑥) in three dimensions (𝑑 = 3). In summary, we find

𝛴(𝑥) ∼ 𝑥^𝜙, with 𝜙 = (3 − 𝑑)/4.

This suggests that 𝑑 = 3 is the upper critical dimension beyond which the fluctuations of 𝐷 and 𝑘 become irrelevant.

4 Empirical laws in biology

Empirical laws play a central role in physics. Examples are Ohm's law, Stokes's law, the ideal gas law, or Newton's laws. These laws are simple, broadly valid, and enable quantitative predictions, even though they are often hard to derive from first principles and their microscopic origins are usually incompletely understood. While biology has so far been a less quantitative discipline, similar laws also exist in biology. In recent years, efforts by biological physicists have led to the discovery of several new laws. These laws enable faithful quantitative predictions about biological systems.
Here, we will introduce some of these laws, highlight their key applications, and discuss potential origins.

4.1 Scaling laws in biology

Scaling behavior is often considered in physics, e.g., how the time 𝜏 it takes for a molecule to move across space scales with the distance 𝐿. Certain biological phenomena can also be studied over a broad range of scales (e.g., the size of organisms). We will introduce some key examples of scaling laws in biology and try to make sense of them.

4.1.1 Allometric scaling

The (basal) metabolic rate is the power (energy per unit time) that an animal needs to keep its body functioning at rest. It includes all metabolic reactions, breathing, brain activity, heart beat, etc. The metabolic rate can be measured as heat production and compared for organisms of different sizes.

Kleiber's law. The metabolic rate 𝑃 of animals scales with their body mass 𝑀 as 𝑃 ∝ 𝑀^{3/4}. (Figure from [32].) There is no obvious explanation for the exponent 3/4. Indeed, for a long time it was thought that the data are consistent with an exponent 2/3, for which there would be a simple Euclidean scaling argument.

Biological rates and time scales. Many biological rates 𝑘 (e.g., the heart rate or respiratory rate) scale with animal size as 𝑘 ∝ 𝑀^{−1/4}. Further, biological time scales 𝜏 (e.g., the life span) scale as 𝜏 ∝ 𝑀^{1/4}. While there is no shortage of hypotheses for explaining these quarter-power scaling laws [30], their origin has so far remained mysterious.

4.1.2 Scaling in the functional content of genomes

The total number of genes in different organisms varies over several orders of magnitude: Viruses have just a handful of genes, while animals have at least tens of thousands. The number of genes in different functional groups appears to scale systematically with genome size [37].
In particular, across different bacteria, the number of genes coding for transcription factors 𝑁TF scales with the total number of genes 𝑁tot approximately as

𝑁TF ∼ 𝑁tot².

In contrast, the number of metabolic genes scales linearly [37]: 𝑁met ∼ 𝑁tot. Hence, with growing genome size, an increasing fraction of the genome is devoted to the regulation of gene expression. (In the figure from [37], TF genes are shown in red, metabolic genes in blue.)

4.2 Bacterial growth laws

If the environment is kept constant, a steady state in which a population of cells grows exponentially is reached after some time. The growth rate that is reached depends strongly on the types and concentrations of nutrients in the environment. For example, E. coli bacteria grow faster if glucose is available as the only carbon source than for other sugars (such as lactose). Moreover, physical properties of the environment (such as temperature or pressure) affect the growth rate.

Growth laws relate quantitative properties of the cell (usually of its macromolecular composition) to its growth rate. Despite the complexity of cell physiology, many properties of the cell depend only on its growth rate, irrespective of the detailed composition of the nutrient environment.

Example: Monod's law. This classical example describes the dependence of the growth rate of microbes on the concentration of a limiting nutrient (substrate):

𝜆 = 𝜆max 𝑐S/(𝐾S + 𝑐S).

Here, 𝜆 denotes the growth rate, 𝜆max is the maximum growth rate at saturating concentration of the substrate, 𝑐S is the concentration of the substrate, and 𝐾S is the concentration of the substrate at which half-maximal growth is achieved.

4.2.1 Dependence of cellular DNA content on growth rate

E. coli and other bacteria have a single circular chromosome. The replication of this chromosome is initiated at the origin of replication.
Upon initiation, two replication forks move around the chromosome from the origin towards the terminus in opposite directions. Replication from origin to terminus (C period) takes 𝜏C ≈ 40 min. Upon termination, there is a delay of 𝜏D ≈ 20 min until the next cell division (D period). At high growth rates, the generation time can be as low as 𝜏 ≈ 20 min. To achieve generation times that are shorter than the time it takes to replicate the chromosome, multiple pairs of replication forks run in parallel: Chromosome replication for the next but one cell division already starts before the next cell division has occurred.

Cooper–Helmstetter law: In a population of bacteria in steady state exponential growth, the average amount of DNA ⟨𝐺⟩ per cell (in genome equivalents) depends on the generation time 𝜏 as [11, 9]:

⟨𝐺⟩ = 𝜏/(𝜏C ln 2) · (2^{(𝜏C+𝜏D)/𝜏} − 2^{𝜏D/𝜏}).

The derivation assumes that the rate of DNA replication is constant along the chromosome, which is true to a very good approximation. This result agrees well with quantitative measurements of DNA content at different growth rates.

Dependence of cell size on growth rate. There is a closely related result for the dependence of cell size on growth rate. Cell size is typically quantified as the dry mass, i.e., the cell mass excluding water. Cell volume is proportional to the dry mass. Proteins are the dominant contribution to the cell mass; other molecules such as DNA, RNA, or lipids contribute little to the cell mass. Empirical observations for diverse organisms and cell types show that cell size generally increases with growth rate and DNA content. The initiation mass mechanism postulates that DNA replication (in E. coli) is initiated when the cell mass per origin of replication reaches a threshold value 𝑀ini that is independent of the growth rate [13].
This assumption agrees well with experimental data from populations and individual cells [38], even though the molecular mechanisms that lead to this phenomenon are still poorly understood. This assumption results in a law for the average cell mass ⟨𝑀⟩ at different growth rates:

⟨𝑀⟩ = 𝑀ini 2^{(𝜏C+𝜏D)/𝜏}.

4.2.2 Dependence of cellular ribosome content on growth rate

Ribosomes play a central role in growth, since they produce all proteins in the cell. In E. coli, the concentration 𝑟 of ribosomes in the cell (measured as the total mass of RNA in the cell divided by the total protein mass) depends on the growth rate 𝜆 following the growth law

𝑟 = 𝑟0 + 𝜆/𝜅t    (2)

when 𝜆 is varied by changing the nutrient environment. Here, 𝑟0 is a y-axis intercept that corresponds to a fraction of ribosomes that do not contribute to growth, and 𝜅t determines the slope. This growth law (except for the small y-axis intercept) can be derived from the assumptions that protein synthesis by ribosomes is growth limiting and that each ribosome produces protein mass at a constant translation rate. Translation mutants with a lower translation rate per ribosome obey the same growth law with a greater slope, i.e., the parameter 𝜅t is smaller. Indeed, 𝜅t correlates linearly with the translation rate of ribosomes in these mutants. Hence, 𝜅t can be interpreted as the translational capacity.

Second growth law. When the translation rate per ribosome is reduced (e.g., using mutants or antibiotics that target the ribosome, such as chloramphenicol or tetracycline), the ribosome concentration increases following the linear relation [31]

𝑟 = 𝑟max − 𝜆/𝜅n.    (3)

The y-axis intercept 𝑟max is largely independent of the nutrient quality of the growth medium and of the specific way of translation inhibition. This indicates that there is a universal maximum for the fraction of the proteome (i.e., the total protein content of the cell) that can be allocated toward ribosomes.
The inverse slope 𝜅n is strongly correlated with the growth rate in the antibiotic-free growth medium. Thus, 𝜅n can be interpreted as the nutritional capacity of the bacterium in a given growth medium. Together, these empirical laws suggest a simple model for the allocation of the proteome into ribosomal and other proteins.

4.2.3 Minimal model of proteome allocation

Goal: Build a minimal model of proteome allocation that is consistent with the observed growth laws [31]. The simplest model would describe the proteome by two sectors: ribosome-affiliated proteins (R-proteins) and all other proteins. However, the empirical observation that the maximum fraction of R-proteins 𝜙R is 𝜙R^max ≈ 0.55 (this follows from the value of 𝑟max) suggests that the other proteins need to be subdivided into two separate sectors: Q-proteins with a mass fraction 𝜙Q that is generally constant, and P-proteins with a mass fraction 𝜙P → 0 as 𝜙R → 𝜙R^max. Thus, we split the proteome into three sectors (P, Q, and R) [31]. The mass fractions satisfy 𝜙P + 𝜙Q + 𝜙R = 1 and

𝜙P = 𝜙R^max − 𝜙R = 𝜌𝜆/𝜅n.    (4)

Here, 𝜌 is a constant for converting the total RNA/protein ratio 𝑟 into a mass fraction of ribosomal proteins: 𝜙R = 𝜌𝑟.

Interpretation of the P-sector. We see from Eq. (4) that when the nutritional capacity 𝜅n is high, fewer P-proteins are needed to maintain the same growth rate. This relation suggests that P-proteins provide the nutrients needed for growth (i.e., these proteins include metabolic enzymes).

Integrated growth model. Combining the constraints explained above, we can calculate the dependence of the growth rate on the nutritional capacity and the translational capacity:

𝜆(𝜅n, 𝜅t) = ((𝜙R^max − 𝜙0)/𝜌) · 𝜅n𝜅t/(𝜅n + 𝜅t).

For constant 𝜅t, the well-known dependence on the nutrient level 𝜅n, as in Monod's law, results. Here, 𝜅n captures the quality of the nutrient environment in a more general sense than in Monod's law for the nutrient concentration.
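The algebra behind the integrated growth model amounts to solving two linear constraints. A minimal numerical check; all parameter values are illustrative, and 𝜙0 = 𝜌𝑟0 is taken to be the offset from the first growth law:

```python
import numpy as np

# Illustrative values (not fitted): max/offset R-fractions, conversion rho,
# and nutritional / translational capacities in 1/h.
phi_max, phi_0, rho = 0.55, 0.05, 0.76
kappa_n, kappa_t = 0.5, 4.5

# Constraints on (phi_R, lam):
#   phi_R - (rho/kappa_t) lam = phi_0      (first growth law, Eq. 2)
#   phi_R + (rho/kappa_n) lam = phi_max    (second growth law, Eq. 4)
A = np.array([[1.0, -rho / kappa_t],
              [1.0,  rho / kappa_n]])
b = np.array([phi_0, phi_max])
phi_R, lam = np.linalg.solve(A, b)

# Closed-form integrated growth model:
lam_formula = (phi_max - phi_0) / rho * kappa_n * kappa_t / (kappa_n + kappa_t)
print(lam, lam_formula)   # identical: the two constraints imply the model
```

Subtracting the first constraint from the second eliminates 𝜙R and gives 𝜌𝜆(1/𝜅n + 1/𝜅t) = 𝜙R^max − 𝜙0, which is exactly the integrated growth model above.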
Analogy to an electrical circuit. Amusingly, the problem is mathematically analogous to an electrical circuit: the factor κ_n κ_t/(κ_n + κ_t) has the same form as the resistance of two resistors connected in parallel.

References

[1] M. Ackermann. A functional perspective on phenotypic heterogeneity in microorganisms. Nature Reviews Microbiology, 13(8):497–508, Aug. 2015.
[2] M. Ackermann, B. Stecher, N. E. Freed, P. Songhet, W.-D. Hardt, and M. Doebeli. Self-destructive cooperation mediated by phenotypic noise. Nature, 454(7207):987–90, Aug. 2008.
[3] W. L. Allen, I. C. Cuthill, N. E. Scott-Samuel, and R. Baddeley. Why the leopard got its spots: relating pattern development to ecology in felids. Proceedings of the Royal Society B: Biological Sciences, 278(1710):1373–80, May 2011.
[4] N. Q. Balaban, J. Merrin, R. Chait, L. Kowalik, and S. Leibler. Bacterial persistence as a phenotypic switch. Science, 305(5690):1622–5, Sept. 2004.
[5] G. Benettin, L. Galgani, A. Giorgilli, and J.-M. Strelcyn. Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Meccanica, 15(1):9–30, 1980.
[6] L. Bintu, N. E. Buchler, H. G. Garcia, U. Gerland, T. Hwa, J. Kondev, T. Kuhlman, and R. Phillips. Transcriptional regulation by the numbers: models. Current Opinion in Genetics & Development, 15(2):116–24, Apr. 2005.
[7] T. Bollenbach, K. Kruse, P. Pantazis, M. González-Gaitán, and F. Jülicher. Robust formation of morphogen gradients. Physical Review Letters, 94(1):018103, Jan. 2005.
[8] T. Bollenbach, P. Pantazis, A. Kicheva, C. Bökel, M. González-Gaitán, and F. Jülicher. Precision of the Dpp gradient. Development, 135(6):1137–46, Mar. 2008.
[9] H. Bremer and G. Churchward. An examination of the Cooper-Helmstetter theory of DNA replication in bacteria and its underlying assumptions. Journal of Theoretical Biology, 69(4):645–654, Dec. 1977.
[10] S. Camalet, T. Duke, F. Jülicher, and J. Prost. Auditory sensitivity provided by self-tuned critical oscillations of hair cells. Proceedings of the National Academy of Sciences of the United States of America, 97(7):3183–8, Mar. 2000.
[11] S. Cooper and C. E. Helmstetter. Chromosome replication and the division cycle of Escherichia coli. Journal of Molecular Biology, 31(3):519–540, Feb. 1968.
[12] J. D. Crawford. Introduction to bifurcation theory. Reviews of Modern Physics, 63(4):991–1037, Oct. 1991.
[13] W. D. Donachie. Relationship between cell size and time of initiation of DNA replication. Nature, 219(5158):1077–9, Sept. 1968.
[14] A. Eldar, D. Rosin, B. Z. Shilo, and N. Barkai. Self-enhanced ligand degradation underlies robustness of morphogen gradients. Developmental Cell, 5(4):635–646, Oct. 2003.
[15] M. B. Elowitz and S. Leibler. A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767):335–338, Jan. 2000.
[16] T. Gardner, C. Cantor, and J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403(6767):339–342, Jan. 2000.
[17] D. T. Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of Computational Physics, 22(4):403–434, 1976.
[18] D. T. Gillespie. Markov Processes: An Introduction for Physical Scientists. Academic Press, 1991.
[19] I. Golding, J. Paulsson, S. M. Zawilski, and E. C. Cox. Real-time kinetics of gene activity in individual bacteria. Cell, 123(6):1025–1036, Dec. 2005.
[20] H. Haken. At least one Lyapunov exponent vanishes if the trajectory of an attractor does not contain a fixed point. Physics Letters A, 94(2):71–72, 1983.
[21] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge University Press, Cambridge, UK, 2nd edition, 2003.
[22] A. J. Koch and H. Meinhardt. Biological pattern formation: from basic mechanisms to complex structures. Reviews of Modern Physics, 66(4):1481–1508, 1994.
[23] G.-W. Li and X. S. Xie. Central dogma at the single-molecule level in living cells. Nature, 475(7356):308–15, July 2011.
[24] K. Mitosch, G. Rieckh, and T. Bollenbach. Noisy response to antibiotic stress predicts subsequent single-cell survival in an acidic environment. Cell Systems, 4(4):393–403.e5, Apr. 2017.
[25] J. R. Moffitt and C. Bustamante. Extracting signal from noise: kinetic mechanisms from a Michaelis-Menten-like expression for enzymatic fluctuations. The FEBS Journal, 281(2):498–517, Jan. 2014.
[26] P. Nelson. Physical Models of Living Systems. W. H. Freeman, 2015.
[27] T. M. Norman, N. D. Lord, J. Paulsson, and R. Losick. Memory and modularity in cell-fate decision making. Nature, 503(7477):481–6, Nov. 2013.
[28] R. Phillips, J. Kondev, J. Theriot, and H. Garcia. Physical Biology of the Cell. Taylor & Francis, 2012.
[29] T. Sauer, J. Yorke, and M. Casdagli. Embedology. Journal of Statistical Physics, 65:579–616, 1991.
[30] V. M. Savage, J. F. Gillooly, W. H. Woodruff, G. B. West, A. P. Allen, B. J. Enquist, and J. H. Brown. The predominance of quarter-power scaling in biology. Functional Ecology, 18(2):257–282, 2004.
[31] M. Scott, C. W. Gunderson, E. M. Mateescu, Z. Zhang, and T. Hwa. Interdependence of cell growth and gene expression: origins and consequences. Science, 330(6007):1099–102, Nov. 2010.
[32] A. J. Spence. Scaling in biology. Current Biology, 19(2):R57–61, Jan. 2009.
[33] S. H. Strogatz. Nonlinear Dynamics and Chaos. Westview Press, 2014.
[34] Y. Taniguchi, P. J. Choi, G.-W. Li, H. Chen, M. Babu, J. Hearn, A. Emili, and X. S. Xie. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science, 329(5991):533–8, July 2010.
[35] A. M. Turing. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society B: Biological Sciences, 237:37–72, 1952.
[36] N. van Kampen. Stochastic Processes in Physics and Chemistry. North Holland, 2007.
[37] E. van Nimwegen. Scaling laws in the functional content of genomes. Trends in Genetics, 19(9):479–84, Sept. 2003.
[38] M. Wallden, D. Fange, E. G. Lundius, Ö. Baltekin, and J. Elf. The synchronization of replication and division cycles in individual E. coli cells. Cell, 166(3):729–39, July 2016.
[39] L. Wolpert. Positional information and the spatial pattern of cellular differentiation. Journal of Theoretical Biology, 25(1):1–47, Oct. 1969.
[40] M. Zagorski, Y. Tabata, N. Brandenberg, M. P. Lutolf, G. Tkačik, T. Bollenbach, J. Briscoe, and A. Kicheva. Decoding of position in the developing neural tube from antiparallel morphogen gradients. Science, 356(6345):1379–1383, June 2017.

T. Bollenbach and G. Ansmann, July 5, 2023