Data-driven process understanding
Gunnar Malmquist, GE Healthcare, Uppsala, Sweden
EPSRC Centre User Group Meeting, 14 April 2015

“Change is one thing, progress is another.” Bertrand Russell

Outline
• Introduction
• Empirical modeling of unit ops
• Beyond unit ops
• The need for mechanistic modeling
• The desired destination
[Diagram labels: Design; Control; Process; Analysis; Understanding]

Today’s approaches are not prepared for the onslaught of Industrial Big Data
• Too slow
• Too expensive
• Too rigid
80% of an analytics project typically involves gathering and then preparing the data for analysis*
*Source: IDC

Data prep pain in social media
“In Data Science, 80% of the time spent preparing data, 20% of the time spent complaining about need for preparing data”
BigDataBorat (via dkalab.tumblr.com)

Imagine instant access… E.g.
[Diagram labels: Work Orders; Stock transactions; Recipes; Batch/EBR; Materials Mgmt; Powerful Analytics; Visualization Tools; Process Data; Control; Asset Data; Consumables; Traceability; Sensor Calibration; Raw Material Datasets; Probes; Transmitters; BioPharma Feedstream Data; Incoming CQA; Databases; Process Historians; LIMS (QC Data); Batch Records]
The art of Context

Unit operation software concept
[Diagram labels: Process Equipment agnostic; Cell culture data; Chromatograms; Subsystem; Parameter coding; Analytics subsystem, CLOUD OR ON PREMISE; Parameter coding; GE Raw material Subsystem; Offline data (IPC etc.)]

Process data example: 52 runs from PrA CIP study

First approach: Feature extraction
“Chemometrics for Characterization, Classification and Prediction in Chromatography”, Gunnar Malmquist, Ph.D.
Thesis 1993

PCA of extracted features reveals grouping
[Figure: PCA score plot of the extracted features; points labeled by run number, 1 to 52]

Second approach: Use aligned profiles

PCA on elution phase shows grouping by run order
[Figure: PCA score plot; points labeled by run number]
Outlier run 43 excluded

Holistic process monitoring
• Multivariate out-of-trend detection
• Drill down to outliers
[Figure: process overview, HIC step; legend: Out of trend, Normal]

Drill down to process data (e.g. chromatograms)
[Figure: process overview, HIC step; legend: Out of trend, Normal]

Drill down to raw material data (e.g. resin data)
[Figure: process overview, HIC step, with raw material overview; legend: Out of trend, Normal]

Operational complexity
[Diagram labels: Across drugs; Across sites; Across scales; Process wide; Unit op analytics; MFG network wide analytics; MFG platform wide analytics; PD to MFG; Wing to Wing analytics]
• Need for “Big Data” tools
• Need for mechanistic modeling

Where do we need to go next?
Empirical models (MVDA) and machine learning algorithms are powerful tools for process understanding

Domain-specific mechanistic modeling adds value
• Transfer of knowledge across scales (PD ↔ MFG)
• Effect of changes in particle size or ligand density etc.
• Extra-column effects in chromatography
• Scale-up effects in cell culture
• Identify data transforms driven by domain expertise
• Increases ability to extrapolate
• May increase ability to identify adaptive strategy

The concept is well known but still very interesting!

Hybrid modeling challenges

Parameter estimation
• Can mechanistic model parameters be estimated from a large set of process data directly?
• Extract model parameters from process characterisation?
• Are there simple scale-down experiments that provide additional value for mechanistic modeling?

Modeling with uncertainty
• Can mechanistic models provide value when parameter estimation is less than perfect?
• Add value even if unknowns are present?

An area well suited for a research project?
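
Illustration (not from the talk): the "Parameter estimation" and "Modeling with uncertainty" questions above can be made concrete with a small sketch. It assumes a hypothetical first-order decay model, y = y0 * exp(-k * run), and made-up noisy data for 52 runs; neither the model nor the numbers come from the presentation. The parameter k is estimated by ordinary least squares on log(y), and its standard error is then carried into an extrapolated prediction to show how imperfect estimation widens the answer. The names fit_first_order, TRUE_K, TRUE_Y0 and the noise level are illustrative assumptions.

```python
import math
import random


def fit_first_order(runs, values):
    """Fit values = y0 * exp(-k * run) by least squares on log(values).

    Returns (k, y0, se_k), where se_k is the standard error of k
    from the straight-line fit in log space.
    """
    n = len(runs)
    logs = [math.log(v) for v in values]
    run_mean = sum(runs) / n
    log_mean = sum(logs) / n
    sxx = sum((r - run_mean) ** 2 for r in runs)
    sxy = sum((r - run_mean) * (l - log_mean) for r, l in zip(runs, logs))
    slope = sxy / sxx
    intercept = log_mean - slope * run_mean
    residuals = [l - (intercept + slope * r) for r, l in zip(runs, logs)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    se_slope = math.sqrt(sigma2 / sxx)
    return -slope, math.exp(intercept), se_slope


def predict(y0, k, run):
    """Evaluate the assumed first-order decay model."""
    return y0 * math.exp(-k * run)


# Synthetic data only: hypothetical "true" parameters plus multiplicative noise.
rng = random.Random(0)
TRUE_K, TRUE_Y0 = 0.02, 100.0
runs = list(range(52))
values = [predict(TRUE_Y0, TRUE_K, r) * math.exp(rng.gauss(0.0, 0.03)) for r in runs]

k_hat, y0_hat, se_k = fit_first_order(runs, values)

# Rough range for k: estimate plus/minus two standard errors.
k_low, k_high = k_hat - 2 * se_k, k_hat + 2 * se_k

# Extrapolate beyond the observed runs. The range reflects uncertainty in k only;
# uncertainty in y0 and its correlation with k are ignored in this sketch.
extrap_run = 100
pred = predict(y0_hat, k_hat, extrap_run)
pred_range = (predict(y0_hat, k_high, extrap_run), predict(y0_hat, k_low, extrap_run))

print(f"k = {k_hat:.4f} +/- {se_k:.4f}")
print(f"prediction at run {extrap_run}: {pred:.2f}, range {pred_range[0]:.2f} to {pred_range[1]:.2f}")
```

The further the extrapolation reaches beyond the observed runs, the wider the prediction range becomes for the same standard error in k, which is one simple way to judge whether a less-than-perfect parameter estimate is still useful.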