14th IFAC (International Federation of Automatic Control) Symposium on System Identification, SYSID 2006, March 29-31

IMPACT OF SYSTEM IDENTIFICATION METHODS IN METABOLIC MODELLING AND CONTROL

Dr. J. Geoffrey Chase
Department of Mechanical Engineering
Centre for Bio-Engineering
University of Canterbury
Christchurch, New Zealand

The Situation
Metabolic modelling can significantly improve the clinical control of hyperglycaemia with model-based protocols (e.g. Hovorka et al., 2004; Chase et al., 2005)
For clinical utility, model parameters must be accurately identified for real-time prediction of response to intervention
Current identification methods are mostly non-linear and non-convex, and very computationally intense
With increasing model complexity, parameter trade-off can result in problematic identification. A typical solution is probabilistic population fitting methods (e.g. Vicini and Cobelli, 2001; Hovorka et al., 2004)
A typical clinical situation might use models and identification methods from different sources with local cohort/data.

The Problem & The Goal
Non-linear and non-convex identification methods and models can deliver sub-optimal results, affecting control prediction
– Clinically, prediction is the only true measure of utility
What is the clinical impact of mixing models and identification methods (if any)?
– Currently, model, system ID method and control are all designed together.
– What happens if someone “mixes and matches” without the original designers' insights or experience?
This research compares a recently introduced linear, convex integral-based method and the commonly used non-linear recursive least squares (NRLS) identification method
– Using an accepted metabolic system model from one source and clinical data from another source for “independence”
– “Independence” represents the typical clinical situation and avoids the models or methods being tuned for the cohort
The goal is to examine the computational cost and outcomes of these different methods in a clinical control application context

Model
The model chosen for comparison is loosely based on the 2-compt. minimal model (2CMM) first proposed by Caumo & Cobelli (1993)
– Well documented model that is widely used as a foundation
Main change is the 3 insulin compartments for the remote effects of insulin on glucose distribution/transport, disposal and EGP (endogenous glucose production), introduced by Hovorka et al. (2002)
– Similar model has been used clinically for control
Comprises 6 compartments in total
– 2 glucose compartments g1(t) and g2(t)
– 3 insulin action compartments QD(t), QT(t) and QEGP(t)
– 1 plasma insulin compartment I(t)

Integral-Based Parameter Fitting
A “minimal” approach to identification is used with most model constants identified a priori from literature results
– Selection of population-valued constants is a major issue in biomedical modelling as it assumes the results are not highly sensitive to the parameter
– This assumption may not be true in all clinical scenarios or cohorts
– Required in many cases to ensure the model is identifiable from the available data
The remaining insulin sensitivities SI,D, SI,T and SI,EGP are identified as time-varying model parameters driving the model dynamics (details in the paper)
This approach minimises total computational cost while enabling individual model constants to be varied for more optimised prediction and fit (e.g. Hann et al., 2005)
What is the effect of mixing this approach and this model?
– Would be an “easy” combination for an independent researcher
– Will all assumptions on constant parameters hold?
– Can we identify despite inaccessible, unmeasurable compartments?

Integral-Based Parameter Fitting
SI,D, SI,T and SI,EGP are defined piecewise constant over a time period of 60 mins using Heaviside step functions, H(t).

$$S_{I,j} = \sum_{i=1}^{N} S_{I,j,i} \left( H(t - t_{i-1}) - H(t - t_i) \right), \quad \text{where } j = D,\ T \text{ and } EGP$$

Definition of the distribution of these parameters is arbitrary, i.e. cubic, quadratic etc.
– Approach allows constants to define variation and be pulled out of integrals
2nd order polynomial interpolation is assumed between glucose measurements in the accessible glucose compartment g1(t)
– Error using this approximation has been shown to be minimal (Hann et al., 2005)

Integral-Based Parameter Fitting
Inaccessible glucose compartment g2(t) is modelled using a 2nd order Lagrange polynomial approximation to the analytical solution for this unmeasurable compartment (fortunately, it's a simple enough dynamic)
Within a time period of [t0, tf], an arbitrary number of equations can be generated by integration of model equations over different time periods
The non-linear model thus decomposes into a linear equation system in unknown constants defining parameters to be identified
– Resulting least squares solution is starting point independent and convex!

$$A \begin{bmatrix} S_{I,T} & g_2(t_0) & g_2(t_1) & g_2(t_f) & S_{I,EGP}\,EGP_b \end{bmatrix}^T = b, \qquad C\,S_{I,D} = d$$

Clinical Data
Patient data (n=7) was chosen from an intensive care unit hyperglycaemia control trial (Chase et al., 2005)
Each set of patient data spans 10 hrs with glucose measurements at 0.5 hr intervals.
– Average glucose levels are ~6 mmol/L (range ~4-10 mmol/L)
Prediction window is 1 hr following hourly clinical interventions
Median APACHE II = 23, inter-quartile range = 19-25

Results: Model Fit
[Figure: Residual plot of model fit to patient data. Legend: Patient 2; Patient 5; Patients 1,3,4,6,7]
Model fit errors
– Patient 2 (highest RMSE 0.80 mmol/L, error SD 0.59 mmol/L)
– Patient 5 (smallest RMSE 0.15 mmol/L, error SD 0.08 mmol/L)
Model fit mean absolute percent error (MAPE) for cohort ranges from 2.4-7.4%, which is within reported sensor error

Results: Prediction
[Figure: Residual plot of model prediction to patient data. Legend: Patient 2; Patient 5; Patients 1,3,4,6,7]
Model prediction errors
– MAE for cohort is 1.03 mmol/L, error SD is 0.78 mmol/L
– RMSE is 1.31 mmol/L, MAPE 20.21%
– Very variable depending on the patient and/or time
Prediction MAPE exceeds the reported sensor error
Errors are mostly at or within sensor error or very wide

Results: NRLS
[Figure: Average model fit RMSE for NRLS and integral-based methods. Legend: Nonlinear Parameter ID; Integral-Based Parameter ID]
NRLS implemented using a non-linear ODE least squares solver in MATLAB on a Pentium M 1.7 GHz PC, 1 GB RAM
Integral method has lower error even with approximated compartment
Average values of SI,D, SI,T and SI,EGP from literature used as starting points
Integral-based method with linear approximation of g2(t) is 140X-660X faster than NRLS
NRLS finds local minima as seen in higher average model fit RMSE at most times
Average time to complete model fit for one 10 hr trial using linear integral-based method was 0.46±0.16 s vs 122.60±42.81 s using NRLS

Is it the model or method?
Figure: Average model prediction RMSE with 1-compt. glucose model
Figure legend: 2-compartment glucose model; Chase et al.
model (1-compartment glucose model) (Chase et al., 2005)
Care must be taken not to overfit available data with model dynamics.
For this cohort, the 1-compt. glucose model has significantly smaller prediction errors for a given set of parameters
This result is due to differences in model dynamics and ability to fit the observed behaviour, independent of fitting method
However, model constants were average a priori values and not further optimised
Hence the level of prediction accuracy reported may be expected
A convex identification method exposes the model prediction errors, identifying potential inadequacies in model dynamics and/or constants

Some Conclusions
Cohort model fit RMSE and MAPE were lower using the linear integral-based method compared to NRLS, for the same model
Model complexity can be extended (i.e. multiple compartments) without significantly affecting identification computation time
Integrals can be used for simple inaccessible compartments using approximations
Fitted parameters were all within reported physiological ranges
Issues:
– Different model dynamics and parameters may work better for different cohorts or situations; the comparison is not “complete” and this work is presented to show the potential impacts
– A priori global identifiability should always be considered. Not all models are globally identifiable for all parameters.
Linear, integral-based method shown to have lower computational cost, leading to increased parameter identification (PI) speed
A convex method can identify potential areas of model difficulty or which other parameters may need to be identified in place of a population value.

Acknowledgements
Jessica Lin & AIC3
AIC1
Jason Wong & AIC4
Thomas Lotz
The Danes
Prof Steen Andreassen
AIC2 & Dr. Geoff Shaw
Dunedin
Dr Kirsten McAuley
Maths and Stats Gurus
Dr Dom Lee
Dr Bob Broughton
Prof Graeme Wake
Dr Chris Hann
Prof Jim Mann
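
Supplementary sketch: integral-based fitting on a toy model
The integral-based idea described in the fitting slides can be illustrated on a much simpler system than the 6-compartment model used in this work. The sketch below assumes a hypothetical one-compartment toy model, dG/dt = -p_g G - S_I(t) G I + P, with S_I piecewise constant over 60 min segments, constant insulin I and constant input P. The model, all parameter values and the function names are illustrative assumptions, not the model or code from the paper. Integrating the equation from t0 to each sample time gives one linear equation per sample in the unknown segment values, so an ordinary linear least squares solve recovers them without a starting guess.

```python
import numpy as np


def cumulative_trapezoid(y, t):
    """Running trapezoidal integral of y from t[0] to each t[k]."""
    out = np.zeros_like(y, dtype=float)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(t))
    return out


def segment_index(t, seg_len, n_seg):
    """Index of the piecewise-constant segment that contains each time."""
    return np.minimum((np.asarray(t) // seg_len).astype(int), n_seg - 1)


def simulate(si_values, seg_len, t, g0, p_g, insulin, p_in):
    """Forward-Euler solution of the toy model, used only to make test data."""
    seg = segment_index(t, seg_len, len(si_values))
    g = np.empty(len(t))
    g[0] = g0
    for k in range(len(t) - 1):
        dgdt = -p_g * g[k] - si_values[seg[k]] * g[k] * insulin + p_in
        g[k + 1] = g[k] + (t[k + 1] - t[k]) * dgdt
    return g


def integral_fit(t, g, seg_len, n_seg, p_g, insulin, p_in):
    """Estimate the segment values of S_I by linear least squares.

    Integrating the toy model from t[0] to t[k] gives
      sum_i S_I_i * (-int_{segment i} G*I dt)
          = G(t_k) - G(t_0) + p_g * int G dt - P * (t_k - t_0),
    which is linear in the unknown S_I_i.
    """
    seg = segment_index(t, seg_len, n_seg)
    A = np.zeros((len(t) - 1, n_seg))
    for i in range(n_seg):
        # The Heaviside window keeps the integrand only inside segment i.
        windowed = np.where(seg == i, g * insulin, 0.0)
        A[:, i] = -cumulative_trapezoid(windowed, t)[1:]
    b = (g[1:] - g[0]) + p_g * cumulative_trapezoid(g, t)[1:] - p_in * (t[1:] - t[0])
    si_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return si_hat


# Illustrative values only (not taken from the paper or from patient data).
true_si = np.array([1.0e-3, 0.5e-3, 1.5e-3])   # one value per 60 min segment
t = np.linspace(0.0, 180.0, 18001)             # minutes
g = simulate(true_si, 60.0, t, g0=8.0, p_g=0.01, insulin=20.0, p_in=0.15)
si_hat = integral_fit(t, g, 60.0, 3, p_g=0.01, insulin=20.0, p_in=0.15)
print("true S_I:     ", true_si)
print("estimated S_I:", si_hat)
```

The sketch uses densely sampled, noise-free simulated data. With half-hourly clinical measurements, the slides instead interpolate g1(t) with 2nd order polynomials before integrating, and the inaccessible compartment adds further unknowns to the linear system.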