Intrinsic Noise in Nonlinear Gene Regulation Inference

Du Chao¹ & Wing H. Wong²
¹ Department of Statistics, University of Virginia. ² Department of Statistics, Stanford University.

Transcriptional Regulation: An Overview
• The abundance of each species (RNA or protein) is treated as a discrete random variable that represents the copy number of its molecules.
• The properties of a gene regulation system are determined by the rates of transcription, translation and degradation, which in turn depend on the abundance of RNAs and proteins.
• The key question is how to infer the regulatory relationships between genes from expression data.

[Figure: genes 1 to n are transcribed into RNA and translated into protein; RNA and protein degrade to ∅; proteins regulate transcription.]

Intrinsic Noise: Modeling and Inference
• Modern single-cell experiments allow us to study the gene regulation system as a stochastic system.

The state of the system is $X = \{X_1, X_2, \cdots, X_M\}$. Production of species $i$ moves the state from $\{X_1, \cdots, X_i, \cdots, X_M\}$ to $\{X_1, \cdots, X_i + 1, \cdots, X_M\}$ at rate $\tau_i F_i(X)$; degradation moves it to $\{X_1, \cdots, X_i - 1, \cdots, X_M\}$.

Problem of Interest:
How can we infer the unknown parameters and terms in the rational function from the observed distribution of $X$?

$$F_i(X) = \frac{b_{i0} + \sum_{j=1}^{m_i} b_{ij} S_{ij}}{1 + \sum_{j=1}^{m_i} c_{ij} S_{ij}}, \qquad S_{ij} = \prod_k X_k^{n_{kij}}, \qquad b_{ij} = ?, \quad c_{ij} = ?$$

Solution:
At the steady state, $dP(X)/dt = 0$, so for a test function $H$

$$E\Big[\sum_{X'} \big(H(X') - H(X)\big)\, W(X \to X')\Big] = 0,$$

where $W(X \to X')$ is the transition rate from state $X$ to state $X'$. Taking $H(X) = e^{\sum_k \theta_k X_k}$ with real $\theta_k$, that is, using the moment generating function of $X$, we can construct a series of linear functions of the unknown parameters $b_{ij}$ and $c_{ij}$. Their coefficients are expectations such as $E[e^{\sum_k \theta_k X_k}]$, $E[S_{ij}\, e^{\sum_k \theta_k X_k}]$ and $E[X_i\, e^{\sum_k \theta_k X_k}]$, which can be estimated from the observed distribution of $X$.
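The steady-state identity $E[\sum_{X'} (H(X') - H(X))\, W(X \to X')] = 0$ can be checked numerically in a case where the stationary distribution is known. The sketch below assumes a hypothetical one-gene system with constant production rate `k` and degradation rate `r * x`, whose stationary distribution is Poisson with mean `k / r`, and uses $H(x) = e^{\theta x}$. The names `k`, `r`, `n_max` and `stationarity_residual` are illustrative and do not come from the poster.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability mass, computed in log space for numerical stability."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def stationarity_residual(theta, k=5.0, r=0.5, n_max=200):
    """Left-hand side of the steady-state identity for H(x) = exp(theta * x).

    Assumed toy model: X -> X + 1 at rate k, X -> X - 1 at rate r * X.
    The expectation is taken under Poisson(k / r), truncated at n_max.
    """
    lam = k / r
    total = 0.0
    for x in range(n_max + 1):
        h = math.exp(theta * x)
        birth = (math.exp(theta * (x + 1)) - h) * k      # production channel
        death = (math.exp(theta * (x - 1)) - h) * r * x  # degradation channel
        total += poisson_pmf(x, lam) * (birth + death)
    return total


for theta in (-0.5, 0.1, 0.3):
    print(theta, stationarity_residual(theta))
```

The residual should vanish up to floating-point and truncation error for every `theta`. In the inference setting the same expectations would be replaced by sample averages over single-cell measurements, and the production rate would be the unknown rational function rather than a constant.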
• The system can be modeled by a multivariate birth-death process whose evolution is described by the Kolmogorov forward equation (master equation):

$$\frac{dP(X)}{dt} = \sum_{X'} W(X' \to X)\, P(X') - \sum_{X'} W(X \to X')\, P(X)$$

[Poster background image: see [1].]

Procedure of Inference:
1. Propose a candidate model (rational function).
2. Collect single-cell expression data from different steady states (via perturbation experiments).
3. Estimate the moments of gene expression at the steady states and construct linear functions of the unknown parameters.
4. Estimate the unknown parameters using convex optimization.

Example: Genetic Toggle Switch
• A genetic toggle switch is a bistable gene-regulation module.
• It occurs in natural gene circuits and has been constructed artificially in the lab [3].
• It can be formed by two repressor genes.

How to Account for the Physical Nature of Cellular System?

1. Quasi-Steady-State Assumption
• As RNA metabolism is often much slower (or faster) than protein metabolism, we may assume that the concentrations of the protein (or RNA) species reach a "quasi-steady state" instantly.
• This assumption allows us to simplify the model so that it includes only the RNA (or protein) species.

[Figure: simplified model for gene i, with production of $x_i$ at rate $\tau_i f_i(x)$ and degradation of $x_i$ to ∅.]

2. Production Rates as a Nonlinear Rational Function
• The production rates are determined by the concentrations of different transcription factors, which are protein molecules or complexes formed by several protein molecules.

$$\tau_i f_i(x) = \tau_i\, \frac{b_{i0} + \sum_{j=1}^{m_i} b_{ij} S_{ij}}{1 + \sum_{j=1}^{m_i} c_{ij} S_{ij}}$$
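To make the rational production rate concrete and connect it to the birth-death description, here is a minimal sketch that evaluates a rate of the form $(b_0 + \sum_j b_j S_j)/(1 + \sum_j c_j S_j)$ and uses it in a stochastic simulation. The rates are those of the poster's toggle switch example: production $0.5/(1 + X_2^2)$ and $0.5/(1 + X_1^2)$, degradation $0.01 X_i$. The function names, the data layout and the use of Gillespie's direct method are assumptions made for illustration; the poster does not state how its samples were generated.

```python
import random


def rational_rate(x, b0, b_terms, c_terms):
    """Evaluate (b0 + sum_j b_j S_j) / (1 + sum_j c_j S_j), S_j = prod_k x[k] ** n_k.

    b_terms and c_terms are lists of (coefficient, {species_index: exponent}).
    """
    def monomial(exponents):
        value = 1.0
        for k, n in exponents.items():
            value *= x[k] ** n
        return value

    numerator = b0 + sum(b * monomial(e) for b, e in b_terms)
    denominator = 1.0 + sum(c * monomial(e) for c, e in c_terms)
    return numerator / denominator


# (species index, change in copy number) for each reaction channel
REACTIONS = [(0, +1), (1, +1), (0, -1), (1, -1)]


def toggle_rates(x, degradation=0.01):
    """Rates of the four channels, in the same order as REACTIONS."""
    return [
        rational_rate(x, 0.5, [], [(1.0, {1: 2})]),  # 0.5 / (1 + X2^2)
        rational_rate(x, 0.5, [], [(1.0, {0: 2})]),  # 0.5 / (1 + X1^2)
        degradation * x[0],
        degradation * x[1],
    ]


def simulate_toggle(t_end, x0=(0, 0), seed=0):
    """Simulate one cell up to time t_end and return its final copy numbers."""
    rng = random.Random(seed)
    x, t = list(x0), 0.0
    while True:
        rates = toggle_rates(x)
        total = sum(rates)           # always > 0: production never vanishes
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            return tuple(x)
        u, acc = rng.random() * total, 0.0
        for (species, step), rate in zip(REACTIONS, rates):
            acc += rate
            if u < acc:              # channel chosen with probability rate / total
                x[species] += step
                break


cells = [simulate_toggle(500.0, seed=s) for s in range(20)]
print(cells)
```

Each call returns the copy numbers of one simulated cell at time `t_end`. Whether `t_end = 500` is long enough to approximate the steady state is not checked here, and a perturbed steady state would need its own rate function.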
• By the laws of thermodynamics, the production rates can be modeled as a nonlinear rational function of the transcription factor concentrations [2]. Example:

$$f_1(x) = \frac{0.5 + x_1 + 0.01\, x_2 x_3}{1 + x_1 + 0.9\, x_2^2 + 0.2\, x_2 x_3}$$

Each numerator term has the form $b_{ij} S_{ij}$ (with constant $b_{i0}$) and each denominator term the form $c_{ij} S_{ij}$ (with constant 1), where $S_{ij} = \prod_k x_k^{n_{kij}}$:

Transcription factor   Numerator term    Denominator term   Role
NA                     0.5               1                  NA
P1                     $x_1$             $x_1$              Activator
P2 P2                  0                 $0.9\, x_2^2$      Repressor
P2 P3                  $0.01\, x_2 x_3$  $0.2\, x_2 x_3$    Repressor

3. Intrinsic Noise
• In a typical ensemble experiment, the measured gene expression is the average over a large number of cells. Modern techniques allow us to observe gene expression at the single-cell level, which reveals intrinsic noise: stochastic cell-to-cell variation.
• The intrinsic noise, that is, the observed distribution of gene expression at the single-cell level, contains valuable information about the underlying regulatory relationships.

[Figure: ensemble average measurements report a single mean $\bar{X}$ over cells $X_1, X_2, X_3, X_4, \ldots$; single-cell level measurements report each cell's value separately.]

Example [4]: a toggle switch formed by Gene 1 and Gene 2, with production rates

$$F_1(X_1, X_2) = \frac{0.5}{1 + X_2^2}, \qquad F_2(X_1, X_2) = \frac{0.5}{1 + X_1^2}$$

and degradation rates $0.01 X_1$ and $0.01 X_2$.

1. Candidate model:

$$F_1(X_1, X_2) = \frac{b_{10} + b_{11} X_1 + b_{12} X_2 + b_{13} X_1^2 + b_{14} X_1 X_2 + b_{15} X_2^2}{1 + c_{11} X_1 + c_{12} X_2 + c_{13} X_1^2 + c_{14} X_1 X_2 + c_{15} X_2^2}$$

2. Collect samples from two steady states (2000 samples per steady state): the unperturbed steady state $(X_1, X_2)$ and the perturbed steady state $(X_1, X_2 = C)$.
3. Construct linear functions, with the exponent parameters of the moment generating function taken as $(\theta_1, \theta_2) \in [0, 1] \times [0, 1]$ for the unperturbed steady state and $(\theta_1, \theta_2) \in [0, 2] \times \{0\}$ for the perturbed steady state.
4. Estimate the parameters $b_{10}, \ldots, b_{15}$ and $c_{11}, \ldots, c_{15}$.

[1] Source of background image: http://cronodon.com/images/DNA4.jpg.
[2] Bintu, L., Buchler, N. E., Garcia, H. G., Gerland, U., Hwa, T., Kondev, J., and Phillips, R. (2005). Transcriptional regulation by the numbers: models. Current Opinion in Genetics & Development 15: 116–124.
[3] Gardner, T., Cantor, C., and Collins, J. (2000). Construction of a genetic toggle switch in Escherichia coli. Nature 403: 339–342.
[4] Source of image: http://openwetware.org/images/4/4a/Geneswitch.png.