An Iterative Parameter Estimation Method for Biological Systems
Xian Yang, Yike Guo, Jeremy Bradley

Overview
• Background
• Problem Formulation
• Parameter Estimation Methods
• Experiments and Results
• Discussion
• Conclusion

Background
• To interpret the time evolution of biological systems, scientists build models, which are simplified mathematical representations of those systems.
• These models are essentially a set of equations whose solution describes the evolution of the state of the system as a function of time.
• They include the following ingredients:
  • A phase space S
  • Time t
  • An evolution law
• Example:
  • dx/dt = X(x)
  • x_{t+1} = X(x_t)

Background
• Mechanistic models of pathways describe the time evolution of molecules and give detailed insight into pathway dynamics.
• Some parameters, such as kinetic rates and initial concentrations, cannot be measured directly by biological experiments.
• It is therefore necessary to estimate the unknown parameters from the observed system dynamics.
• Given the specified structure of the pathway and the ranges of parameter values to be explored, parameter estimation methods search the parameter space for parameters with which the simulated model exhibits the desired behaviour.

Problem Formulation
• The biological process is modelled by a system of differential equations in which θ contains the unknown parameters that we seek to estimate, the state vector represents the concentration levels of the M species involved in the system, such as proteins and mRNA, at time t, and the initial condition gives the initial concentration levels of these molecules.

Problem Formulation
• The quantities that can be measured experimentally are related to the system state by an output function h.
• Usually, these quantities can only be measured at discrete time instants, and the measurements are corrupted with random noise.
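The formulation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in the talk: the one-species decay law in `f`, the identity output function `h`, the Gaussian noise, the fixed-step Runge-Kutta integrator, and all function names are hypothetical choices made only to show how a state equation, an output function, and noisy discrete-time measurements fit together.

```python
import random

def f(x, theta, t):
    """Evolution law dx/dt = f(x, theta, t); a toy decay model (assumption)."""
    return [-theta[0] * x[0]]

def h(x):
    """Output function relating the measurement to the state (identity here)."""
    return x[0]

def rk4_step(x, theta, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(x, theta, t)
    k2 = f([xi + 0.5 * dt * k for xi, k in zip(x, k1)], theta, t + 0.5 * dt)
    k3 = f([xi + 0.5 * dt * k for xi, k in zip(x, k2)], theta, t + 0.5 * dt)
    k4 = f([xi + dt * k for xi, k in zip(x, k3)], theta, t + dt)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def simulate(theta, x0, t_obs, dt=0.01):
    """Integrate from t = 0 and return h(x) at each observation time."""
    x, t, outputs = list(x0), 0.0, []
    for t_k in t_obs:
        while t < t_k - 1e-12:
            step = min(dt, t_k - t)
            x = rk4_step(x, theta, t, step)
            t += step
        outputs.append(h(x))
    return outputs

def measure(theta, x0, t_obs, noise_sd, rng):
    """Noise-free outputs corrupted with additive Gaussian noise (assumption)."""
    return [y + rng.gauss(0.0, noise_sd) for y in simulate(theta, x0, t_obs)]

rng = random.Random(0)
t_obs = [0.5 * k for k in range(1, 11)]   # discrete measurement times
clean = simulate([0.8], [10.0], t_obs)     # model output for theta = 0.8
noisy = measure([0.8], [10.0], t_obs, noise_sd=0.2, rng=rng)
```

Parameter estimation then amounts to recovering `theta` from `noisy` alone, given `f`, `h`, and a range of plausible values.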
Problem Formulation
• A simple example:
  • Model equations:

Overview
• Background
• Problem Formulation
• Parameter Estimation Methods
  • Other methods
  • Bayesian methods: ABC SMC
  • Integration of ABC SMC and windowing method
• Experiments and Results
• Discussion
• Conclusion

Parameter Estimation methods – Other methods
• Parameter inference methods combine experimental data with knowledge about the underlying structure of the dynamical system under investigation.
• Optimization methods:
  • Formulate the parameter estimation problem as a nonlinear optimization problem whose objective function is the difference between the model prediction and the experimental data.
• Kalman filters:
  • Parameter estimation is handled in the framework of control theory by using state observations.
  • They are recursive estimators that incorporate new information from experimental data.

Parameter Estimation methods – Bayesian Methods
• Advantage:
  • They can infer the whole probability distribution of the parameters, rather than just a point estimate.
• Bayes' rule for parameter estimation:
  • Given the measurable quantities, which reflect the system state, the posterior distribution of the parameters follows from Bayes' rule; that is, the posterior is proportional to the likelihood of the measurements multiplied by the prior distribution of the parameters.

Parameter Estimation methods – Bayesian Methods
• Approximate Bayesian computation (ABC):
  • For complex models, computing the likelihood is time-consuming or intractable.
  • The ABC method avoids explicit evaluation of the likelihood by considering the distance between the observation and data simulated from a model with parameter θ.
  • The generic ABC approach to infer the posterior probability distribution of θ samples a candidate parameter from the prior, simulates data from the model with that candidate, and accepts the candidate if the distance between the simulated data and the observation is within the error allowance ϵ.

Parameter Estimation methods – Bayesian Methods
[Figure: the (θ1, θ2) parameter plane alongside the trajectory X(t) plotted against time t.]

Parameter Estimation methods – Bayesian Methods
• Weakness of the ABC rejection sampler:
  • High rejection rate
• ABC Sequential Monte Carlo (SMC) method:
  • The ABC SMC method obtains the posterior distribution via a sequence of intermediate distributions, where P is the total number of intermediate distributions. The error thresholds must satisfy ϵ_{p+1} < ϵ_p for all p.
  • Note that the first step of the ABC SMC method is the ABC rejection algorithm.
  • In the special case where the prior distribution of the parameters and the perturbation kernel are both uniform, the sampling weight becomes 1/N.

Parameter Estimation methods – Bayesian Methods
[Figure: three successive populations in the (θ1, θ2) plane, Population 1 (ϵ1), Population 2 (ϵ2) and Population 3 (ϵ3), each obtained from the previous one by random sampling and perturbing.]

Parameter Estimation methods – Integration of ABC SMC and windowing method
• When prior knowledge of the parameters is poor, the parameter space to be explored is large, and the first run of the ABC SMC method, which corresponds to the ABC rejection algorithm, takes a long time to find the first intermediate distribution.
• To solve this problem, we introduce a windowing method.

Parameter Estimation methods – Integration of ABC SMC and windowing method
• Assume two parameters, θ(1) and θ(2), are to be estimated. Their prior distributions are Pr(θ(1)) = U(ϴ1, ϴ2) and Pr(θ(2)) = U(ϴ3, ϴ4), and the error allowance is ϵ.
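The two-parameter setting above can be sketched end to end in Python. Everything model-specific here is an assumption made for illustration: the closed-form toy model in `simulate`, the sum-of-squares `distance`, the regular grid and bounding-box rule inside `windowing`, the threshold schedule, and all names. Sampling the previous population with equal probability follows the uniform-prior, uniform-kernel special case stated above, in which the weight is 1/N.

```python
import math
import random

T_OBS = [0.5 * k for k in range(1, 11)]

def simulate(theta):
    """Toy model y(t) = a * exp(-b * t), standing in for an ODE solve."""
    a, b = theta
    return [a * math.exp(-b * t) for t in T_OBS]

def distance(theta, data):
    """Sum of squared differences between simulation and observation."""
    return sum((s - d) ** 2 for s, d in zip(simulate(theta), data))

def windowing(bounds, data, eps, cells=20):
    """Score the centre of each grid cell; return the box around accepted cells."""
    widths = [(hi - lo) / cells for lo, hi in bounds]
    kept = []
    for i in range(cells):
        for j in range(cells):
            centre = (bounds[0][0] + (i + 0.5) * widths[0],
                      bounds[1][0] + (j + 0.5) * widths[1])
            if distance(centre, data) < eps:
                kept.append(centre)
    if not kept:
        return bounds  # nothing accepted: keep the original ranges
    return [(min(c[d] for c in kept) - 0.5 * widths[d],
             max(c[d] for c in kept) + 0.5 * widths[d]) for d in range(2)]

def abc_smc(bounds, data, eps_schedule, n, sigma, rng):
    """ABC SMC with uniform prior, uniform kernel and equal weights 1/N."""
    population = []
    for p, eps in enumerate(eps_schedule):
        accepted = []
        while len(accepted) < n:
            if p == 0:
                # First step: plain ABC rejection from the uniform prior.
                cand = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
            else:
                base = rng.choice(population)  # each member has weight 1/N
                cand = tuple(b + s * rng.uniform(-1.0, 1.0)
                             for b, s in zip(base, sigma))
                if any(not lo <= c <= hi for c, (lo, hi) in zip(cand, bounds)):
                    continue  # outside the prior support
            if distance(cand, data) < eps:
                accepted.append(cand)
        population = accepted
    return population

rng = random.Random(1)
data = [y + rng.gauss(0.0, 0.1) for y in simulate((10.0, 0.8))]
wide = [(0.0, 100.0), (0.0, 10.0)]            # poor prior knowledge
narrow = windowing(wide, data, eps=10.0)      # loose threshold, coarse grid
posterior = abc_smc(narrow, data, eps_schedule=[5.0, 1.0, 0.5],
                    n=200, sigma=(0.5, 0.05), rng=rng)
mean_a = sum(t[0] for t in posterior) / len(posterior)
mean_b = sum(t[1] for t in posterior) / len(posterior)
```

The design point is that `windowing` evaluates only a fixed grid of cheap candidates with a loose threshold, so the rejection step that opens `abc_smc` draws from a much smaller box than `wide`.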
Parameter Estimation methods – Integration of ABC SMC and windowing method
• The parameter estimation method proposed in this paper combines a windowing method with ABC SMC.
• The windowing method is applied first to reduce the parameter space to be explored by ABC SMC. The integrated method proceeds as follows:
  1. Use the windowing method to obtain better prior knowledge of the parameters.
  2. Run the ABC SMC scheme, in which the target posterior distribution of the parameters is approached through a set of intermediate distributions.

Parameter Estimation methods – Integration of ABC SMC and windowing method
• In this paper, the prior distribution of the parameters is set to be uniform, and the perturbation kernel in SMC is also uniform.
• Therefore, the sampling weight equals 1/N for all p and n.
• The ABC SMC method can be improved by adjusting the weight of each parameter set according to its distance value.

Overview
• Background
• Problem Formulation
• Parameter Estimation Methods
• Experiments and Results
  • Model of a biological system
  • Results of the windowing method
  • Results of the ABC SMC method using equal sampling weight
  • Results of the ABC SMC method using adaptive sampling weight
• Discussion
• Conclusion

Experiments and Results – Model of a biological system
• In this paper, we infer the parameters of a standard repressilator model.
• This model consists of six differential equations that represent the concentration levels of three mRNAs (indexed by i ∈ {lacI, tetR, cI}) and three repressor proteins (indexed by j ∈ {cI, lacI, tetR}).
• The parameters of the model to be estimated are α0, n, β and α.

Experiments and Results – Settings of experiment
• Settings of experiment:
  • Suppose we are able to measure the concentration levels of the mRNAs at discrete time instants.
  • The measurement noise is assumed to have a standard deviation of 5% of the mean of the signal.
  • The initial concentrations of the six species are specified.
  • The model is simulated with MATLAB's ode45 solver.
  • In the simulation, the observations are generated from a fixed set of parameter values.
  • A distance function measures the discrepancy between the noisy observation and the solution of the system for a given θ.
  • The prior distribution of each parameter is assumed to be uniform.

Experiments and Results – Settings of experiment
• The oscillatory dynamics of the repressilator system with the original experimental settings.
[Figure: measured and simulated concentrations of m_lacI, m_tetR and m_cI (arbitrary units, about -10 to 80) against time (0 to 30 min).]

Experiments and Results – Results of the windowing method
1. Results of the windowing method:
• Partition the whole parameter space into small regions, each identified by its central point.
• We initially set the error threshold to 35000, which is much larger than the desired value.
• Use the performance of the central point of each region to represent the whole region.
• 62 parameter points are selected (acceptance rate 0.22%), each with a distance value smaller than 35000.
• The selected points give a narrower range for each parameter.
• After the windowing method, these narrower ranges define the prior distributions of the parameters that are the inputs of the ABC SMC method.

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
• The error threshold is set successively to 5000, 1000, 500, 300 and 150.
• The number of accepted parameter sets equals 1000.
• The first run of the ABC SMC method is an ABC rejection sampling scheme with the error threshold ϵ0 = 5000.
• To obtain 1000 accepted parameter sets, 148720 parameter sets are randomly sampled from the prior distributions, so the acceptance rate is 1000/148720 = 0.67%.
[Figure: four panels showing the distribution of the accepted values of each parameter.]

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2.
Results of the ABC SMC method using equal sampling weight
• In the second iteration of the ABC SMC method, a parameter set is sampled from the previous population {θ_0^1, θ_0^2, …, θ_0^1000} with weight 1/1000.
• The sampled parameter set is then perturbed with the uniform kernel K_t = σU(-1, 1), where σ = 0.1 for α0, n and β, and σ = 0.5 for α.
• The perturbed parameter set is accepted if its distance value is less than 1000.
• To accept 1000 parameter sets, 62216 parameter sets are checked, giving an acceptance rate of 1000/62216 = 1.6%.
[Figure: four panels showing the distribution of the accepted values of each parameter.]

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
• In the third iteration of the ABC SMC method, the same perturbation function is used.
• The perturbed parameter set is accepted if its distance value is less than 500.
• The total number of parameter sets sampled from the previous distribution is 43731, giving an acceptance rate of 1000/43731 = 2.3%.
[Figure: four panels showing the distribution of the accepted values of each parameter.]

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
• In the fourth iteration of the ABC SMC method, the same perturbation function is used.
• The perturbed parameter set is accepted if its distance value is less than 300.
• The total number of parameter sets sampled from the previous distribution is 71594, giving an acceptance rate of 1000/71594 = 1.4%.
[Figure: four panels showing the distribution of the accepted values of each parameter.]

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2.
Results of the ABC SMC method using equal sampling weight
• In the fifth iteration of the ABC SMC method, the same perturbation function is used.
• The perturbed parameter set is accepted if its distance value is less than 150.
• The total number of parameter sets sampled from the previous distribution is 582685, giving an acceptance rate of 1000/582685 = 0.17%.
[Figure: four panels showing the distribution of the accepted values of each parameter.]

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
Table 1. The number of data generation steps needed to accept 1000 parameter sets for each intermediate distribution, and the corresponding acceptance rate.

ϵ       Data generation steps   Acceptance rate
5000    148720                  0.67%
1000    62216                   1.6%
500     43731                   2.3%
300     71594                   1.4%
150     582685                  0.17%

Total number of data generation steps = 908946

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
Table 2. The mean and variance of each estimated parameter.

Parameter   Mean value   Variance
α0          1.128        0.001
n           2.0893       0.0008
β           4.89         0.073
α           1011         1625

Experiments and Results – Results of the ABC SMC method using equal sampling weight
2. Results of the ABC SMC method using equal sampling weight
[Figure: two-dimensional scatter plots of the accepted parameter sets for error allowances 5000, 1000, 500, 300 and 150.]
The outputs of ABC SMC as two-dimensional scatter plots.

Experiments and Results – Results of the ABC SMC method using adaptive sampling weight
3. Results of the ABC SMC method using adaptive sampling weight
Table 3.
The number of data generation steps needed to accept 1000 parameter sets for each intermediate distribution, and the corresponding acceptance rate, using the adaptive weight.

ϵ       Data generation steps   Acceptance rate
5000    148720                  0.67%
1000    33956                   2.94%
500     29091                   3.44%
300     48570                   2.06%
150     446048                  0.22%

Total number of data generation steps = 706385

Experiments and Results – Results of the ABC SMC method using adaptive sampling weight
3. Results of the ABC SMC method using adaptive sampling weight
[Figure: two-dimensional scatter plots of the accepted parameter sets for error allowances 5000, 1000, 500, 300 and 150.]
The outputs of ABC SMC using adaptive sampling weight as two-dimensional scatter plots.

Discussion
• We compare our proposed parameter estimation method with the method proposed in [1].
• The method in [1] uses the probabilistic information in the measurement noise and treats parameter estimation as a nonlinear optimization problem.
• The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is used in [1] to search for the desired parameters.
• As with other optimization methods, BFGS returns only a single parameter set rather than the posterior distributions of the parameters.
• Moreover, we find that the BFGS method may sometimes return a parameter set whose values are far from the real ones even though its simulated data are almost consistent with the observations.
• In our example, the BFGS-based parameter estimation method predicts parameters to have the following values:

[1] Lillacci, G. and Khammash, M. 2010. Parameter estimation and model selection in computational biology. PLoS Computational Biology, 6, 3 (March 2010), e1000696.
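One way to see what a posterior distribution adds over a single point estimate is to turn the means and variances reported in Table 2 into standard deviations and relative spreads. The sketch below does only that arithmetic on the published figures; the dictionary keys and the `spread` helper are illustrative names, not part of the original work.

```python
import math

# Mean and variance of each estimated parameter, copied from Table 2.
table2 = {
    "alpha0": (1.128, 0.001),
    "n": (2.0893, 0.0008),
    "beta": (4.89, 0.073),
    "alpha": (1011.0, 1625.0),
}

def spread(mean, variance):
    """Return (standard deviation, standard deviation relative to the mean)."""
    sd = math.sqrt(variance)
    return sd, sd / abs(mean)

summary = {name: spread(mean, var) for name, (mean, var) in table2.items()}

# Parameter whose accepted values vary most relative to their mean.
widest = max(summary, key=lambda name: summary[name][1])

for name, (sd, rel) in summary.items():
    print(f"{name}: sd = {sd:.4f}, relative spread = {rel:.2%}")
```

A point estimate carries no counterpart to these spreads, so it cannot indicate which parameters the data constrain tightly and which they leave loose.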
Discussion
[Figure: real and simulated concentrations of m_lacI, m_tetR and m_cI (0 to 80) against time (0 to 30 min).]
Figure. Comparison of the real dynamics of the repressilator system with the dynamics simulated using the estimated parameter set.

Conclusion
• This paper develops an iterative parameter estimation method that efficiently infers the parameters of biological systems given a known mathematical model, experimental measurements and ranges of the parameters.
• The parameters in the mechanistic model of a repressilator system are estimated with the proposed method.
• To increase the efficiency of the ABC SMC method, a windowing technique is introduced to reduce the size of the parameter space to be explored.
• Moreover, the ABC SMC method in this paper uses an adaptive sampling weight, which can reduce the number of data generation steps.

Questions?