A brief overview of the assimilation of neural network based solutions by the (bio)pharmaceutical industry
João Almeida Lopes
Faculdade de Farmácia da Universidade de Lisboa
ISEP, Porto - 2015.10.01

Summary
Part [i] Some concepts around the Artificial Neural Networks theme
Part [ii] Applications of Artificial Neural Networks in the pharmaceutical field (R&D)
Part [iii] Illustration of the use of Artificial Neural Networks in three situations (pharmaceutical processes)

Machine Learning
Machine learning deals with the construction and study of systems that can learn from data.
A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
In unsupervised learning, no labels are given to the learning algorithm, leaving it on its own to group similar inputs (clustering).
Purposes:
• association
• classification (clustering)
• transformation (different representation)
• modelling
Artificial Neural Networks are Machine Learning Models
Adapted from Demuth & Beale (2004) Neural Network Toolbox For Use with MATLAB®, The Mathworks, Natick, MA

Artificial Neural Networks (ANN)
Artificial neural networks (ANN) are computational models inspired by the nature of neurons.
• Neurons are inter-connected and arranged in layers.
• ANNs can be of very different nature.
• The number of parameters (degrees of freedom) is normally quite high.
• An ANN is essentially a “black-box” model: fit for a purpose but difficult to interpret.
• ANNs often generalise poorly, as is the case with other data-driven modelling approaches.
Wesolowski & Suchacz (2012) Artificial Neural Networks: theoretical background and pharmaceutical applications: a review, J. AOAC Internat., 95(3), 652-668

ANNs can be of different nature and serve different purposes
Prediction: Linear Filter, Feedforward, Elman (dynamic)
Classification: Perceptron, Self Organizing Map (Kohonen), Learning Vector Quantization (hybrid)
Also shown: Radial Basis, Hopfield (dynamic)
Adapted from Demuth & Beale (2004) Neural Network Toolbox For Use with MATLAB®, The Mathworks, Natick, MA

ANNs’ issues
To discover the best ANN model, numerous candidate models must be trained and evaluated. The problem is that many choices must be made:
• Network type (feedforward, Kohonen, Elman, etc.)
• Network topology/size/structure (static, dynamic, etc.)
• Learning algorithm
• Learning procedure (batch, incremental, etc.)
• Learning algorithm parameters (learning rate, regularization, etc.)
• Fitting function
• Stopping procedure/validation
Special precautions are to be taken to avoid erroneous conclusions:
• Network initialization
• Over-fitting
• Model selection
Verikas & Bacauskiene (2003) Using artificial neural networks for process and system modelling, Chemom. Intell. Lab. Syst., 67, 187-191

ANNs don’t do miracles
ANNs estimate patterns or relationships (when properly used) directly from data. Therefore, ANNs don’t do miracles*. If the data is of bad quality, ANNs will generate spurious relationships, and these must be detected.
[Figure: two profiles, an NIR spectrum (diffuse reflectance vs. wavelength) and the Euribor 6 months rate (rate vs. time)]
One of the above profiles is a near infrared spectrum taken from a pharmaceutical product (powder mix). Which one is it?
*Additional knowledge can be incorporated into ANNs; such models are generally called “hybrid models”.
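
The precautions listed under “ANNs’ issues” (network initialization, over-fitting, model selection) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the data are a synthetic noisy sine curve, the network is a hand-written one-hidden-layer model, and every name (`init_network`, `train`, `select_model`, the candidate sizes) is hypothetical and not taken from the talk. Several candidate topologies are trained from several random initializations, and the candidate with the lowest error on held-out validation data is kept.

```python
import math
import random


def init_network(n_hidden, rng):
    """One-input, one-output network with a single tanh hidden layer."""
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def predict(net, x):
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]


def mse(net, data):
    """Mean squared error of the network over (x, y) pairs."""
    return sum((predict(net, x) - y) ** 2 for x, y in data) / len(data)


def train(net, data, learning_rate=0.05, epochs=300):
    """Incremental (sample-by-sample) gradient descent on the squared error."""
    for _ in range(epochs):
        for x, y in data:
            hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
            error = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"] - y
            for j, h in enumerate(hidden):
                # Back-propagate through tanh before the output weight changes.
                grad_hidden = error * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= learning_rate * error * h
                net["w1"][j] -= learning_rate * grad_hidden * x
                net["b1"][j] -= learning_rate * grad_hidden
            net["b2"] -= learning_rate * error
    return net


def select_model(train_set, valid_set, hidden_sizes, n_restarts, seed=0):
    """Train every candidate from several initializations; rank by validation error."""
    rng = random.Random(seed)
    results = []
    for n_hidden in hidden_sizes:
        for restart in range(n_restarts):
            net = train(init_network(n_hidden, rng), train_set)
            results.append((mse(net, valid_set), n_hidden, restart, net))
    return min(results, key=lambda r: r[0]), results


# Synthetic data (assumption): noisy sine curve sampled on [-1, 1].
data_rng = random.Random(42)
samples = [(x, math.sin(math.pi * x) + data_rng.gauss(0.0, 0.1))
           for x in (i / 15.0 - 1.0 for i in range(31))]
data_rng.shuffle(samples)
train_set, valid_set = samples[:21], samples[21:]

best, results = select_model(train_set, valid_set, hidden_sizes=[1, 3, 6], n_restarts=3)
best_error, best_hidden, best_restart, best_net = best
print(f"selected {best_hidden} hidden neurons, validation MSE {best_error:.4f}")
```

Because the ranking uses data not seen during training, a larger network that merely memorises the training points gains no advantage. A further independent test set would still be needed to report the performance of the selected model.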
Number of publications involving “neural networks” in the pharma field
Source: Web of Science (Thomson Reuters)
Pharma encompasses circa 0.7% of all ANN related publications
Note (search strings):
• All ANN publications: TS=(neural networks) AND PY=####
• Pharma subset: TS=(neural networks) AND TS=(drug discovery OR pharmaceutics OR pharmaceutical industry OR pharmaceutical OR pharmaceutical technology OR drug design OR drug synthesis OR drug production) AND PY=####

ANNs applied to pharmaceutical sciences [i]
DRUG DESIGN & DRUG DISCOVERY
• Analysis of structure-activity data (analysis of multi-dimensional data)
• Establishment of quantitative structure-activity relationships (QSAR)
• Combinatorial discovery
• Toxicity modelling
• Gene prediction
• Locating protein-coding regions in DNA sequences
• 3D structure alignment
• Pharmacophore perception
• Docking of ligands to receptors
• Automated generation of small organic compounds
• Design of combinatorial libraries
Terfloth & Gasteiger (2001) Neural networks and genetic algorithms in drug design, Drug Discov. Today, 6(15), 102-108
Winkler (2004) Neural networks as robust tools in drug lead discovery and development, Mol. Biotechnol., 27(2), 139-68
Livingstone & Salt (1992) Regression analysis for QSAR using neural networks, Bioorg. Med. Chem. Lett., 2, 213-218

ANNs applied to pharmaceutical sciences [ii]
DRUG DELIVERY
• Structure Retention Relationship (SRR) methodology in pharmacological research
• Pharmacokinetics and pharmacodynamics modelling
• Preformulation
• Optimization of pharmaceutical formulations
• Quantitative Structure-Activity Relationships (QSAR) and Quantitative Structure-Property Relationships (QSPR)
• In vitro-in vivo correlations
• Proteomics and genomics
• Diagnosis of disease
Sutariya et al. (2013) Artificial Neural Network in Drug Delivery and Pharmaceutical Research, The Open Bioinform. J., 7, 49-62
Ibrić et al. (2003) Artificial neural networks in the modeling and optimization of aspirin extended release tablets with eudragit L 100 as matrix substance, AAPS PharmSciTech., 4(1), 62-70
Bourquin et al. (1997) Basic concepts of artificial neural networks (ANN) modeling in the application to pharmaceutical development, Pharm. Dev. Technol., 2(2), 95-109
Takayama et al. (1999) Artificial Neural Network as a Novel Method to Optimize Pharmaceutical Formulations, Pharm. Res., 16(1), 1-6

ANNs applied to pharmaceutical sciences [iii]
PHARMACEUTICAL TECHNOLOGY/PRODUCTION
• Drugs production: synthesis (biological, chemical, etc.)
• Drugs production: downstream operations (isolation, purification, etc.)
• Processes modelling (granulation, coating, tabletting, mixture, etc.)
Mendyk et al. (2015) From Heuristic to Mathematical Modeling of Drugs Dissolution Profiles: Application of Artificial Neural Networks and Genetic Programming, Computational and Mathematical Methods in Medicine, Article ID 863874, 9 pp.

Examples
Case [i] Micro scale (research & development): Predicting properties of polymeric nanoparticles for cancer vaccines manufacturing
Case [ii] Laboratory scale (polymers manufacturing): Monitoring a laboratory scale bioreactor for polyhydroxybutyrates production
Case [iii] Industrial scale (antibiotics manufacturing): Modelling an industrial scale fermentation process for antibiotic production

Case [i] Predicting properties of polymeric nanoparticles for cancer vaccines manufacturing
Objective: to predict, from manufacturing parameters, the properties of polymeric nanoparticles (average size around 150 nm) to be used as cancer vaccines.
Feedforward ANN
INPUTS: Manufacturing parameters (PVA, Glycol Chitosan, Surfactant)
OUTPUTS: Nanoparticles properties (Size, Zeta Potential, Polydispersity Index, Encapsulation Efficiency, Loading Capacity)

Case [i] Experimental design (22 formulations)

Case [i] Manufacturing conditions impact on nanoparticles properties
Influence of input parameters on the estimation of particles average size (Z-average):
• Influence of PVA and Glycol Chitosan
• Influence of surfactants (external phase)

Case [ii] Monitoring a laboratory scale bioreactor for polyhydroxybutyrates (PHB) production

Case [ii] The process was monitored in-situ and in real-time with near infrared spectroscopy
Batches: 4 batch cycles (A, B, C, D)
On-line variables: Time (h), O2 (%), pH, NIR spectra (900-1700 nm)
Off-line variables: Ammonia (N-mmol.L-1), PHB (C-mmol.L-1), PHV (C-mmol.L-1), Biomass (C-mmol.L-1)
[Diagram: Near infrared spectroscopy (NIR), Feedforward ANN]

Case [ii] NIR-based ANN model predictions
[Figure: near infrared spectra, raw and processed]
[Figure: predictions of [PHB] for the test batch, ANN predicted [PHB] (C-mmol.L-1) vs. experimental [PHB] (C-mmol.L-1); R2>0.9]

Case [iii] Modelling an industrial scale fermentation process for antibiotic production
[Diagram: Inoculum, Raw-materials, Fermentation]

Case [iii] Fermentations are intrinsically complex processes
[Figure: the cell factory, from organelles (0.2-10 µm) and the microbial filament (50 µm) to the microbial pellet (1 mm) and the fermentation tank (100 m3); labels: nutrients uptake, α-AAA, CYS, VAL, ACVS, IPNS, IPN, AT, PhCO~CoA, PenG, intracellular antibiotic secretion, cell wall]
A. simplification of the major metabolic pathways in a cell
B. non-localized biosynthesis in industrial strains
C. filamentous antibiotic-producing microorganisms in submerged culture
D. diffusional limitations inside pellets add to the biochemical complexity
E. antibiotic production takes several days in large fed-batch stirred tank reactors

Case [iii] Fermentation process description
[Diagram: Lyophilized spores, Selection, Vegetative culture, Agar culture, Inoculum, Fermentation; Additions (glucose syrup, phenylacetic acid, animal and vegetable oils, ammonia and ammonium sulphate)]
• Stagewise processes (scale-up of inoculum): Inoculum (1.5 days), Fermentation (10 days)
• Biomass growth in pellets
• Slow non-linear time varying processes
• History dependent processes
• Natural variability of complex raw materials
• Consistent control hampered by monitoring and many intrinsic difficulties of bioprocesses
Result: inefficient operation control

Case [iii] Objectives
[Diagram: Oxygen (air), Biomass, Substrates, Precursor, Product (antibiotic)]
• On-Line (Feed Rates, O2, CO2, Pressure, pH, Temperature, Air Flow, Derived Process Variables)
• Off-Line (Biomass, Substrates and Product Concentrations, Physical Parameters)

Case [iii] Major challenges
• To model a complex system with almost no knowledge about process kinetics.
• Industrial fermentation process with complex raw materials (variable composition) and intrinsic variability caused by bacteria.
• Process with a total duration of approximately 140 hours.
• Many operating variables that are changed over time according to an established policy.
• Historical data not appropriate to capture process dynamics (too many similar batches), with lots of missing data.
• Available neural network types not ideal to model batch processes.
• Limited number of experimentally designed batches (industrial scale tests).

Case [iii] Solutions
• Experimental design of a series of 6 fermentations at industrial scale with variations in all input process variables (using a genetic algorithm).
• Running all 6 experiments and collecting the maximum amount of data, with more frequent sampling during the first 24 process hours.
• Adapting feedforward ANNs to handle batch data (new training algorithm and topology).
• Implementing the predictive ANNs and integrating predictions in the process monitoring console (Gensym G2 software).

Case [iii] “Static” versus “Recurrent” ANNs
Standard ANN
Recurrent ANN: outputs are used as inputs at the next prediction step (a simple recurrence rule)
[Figure: Feedforward ANN (“static”) and Feedforward ANN (“recurrent”)]

Case [iii] ANNs predictions
Dynamic/Recurrent ANN topology
ANN inputs are state variables (typically concentrations), off-gas measurements, physical variables (pH, temperature), feed rates and aeration.
[Figure: Test Batch A and Test Batch B; Biomass (g.L-1), Precursor Conc. (g.L-1) and API Conc. (a.u.) vs. Time (h). Solid lines are ANN estimations and markers are experimentally observed data.]

Case [iii] Managing ANNs use and ANNs’ estimations
A small set of rules used essentially to detect different process stages

Case [iii] Monitoring console
[Screenshot annotations:]
• Feed rates, volume and tank concentrations monitoring
• An instance of an animated fermenter object
• On-Line monitored data
• Network prediction
• Fermenter attributes
• Locates the last saved data file related to the chosen tank
• 10 different tanks can be monitored
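
The recurrence rule described under “Static” versus “Recurrent” ANNs (outputs are used as inputs at the next prediction step) can be sketched as follows. This is a minimal illustration, not the network used in the talk: the two-variable state, the single feed-rate control, the weight values and the names `one_step_model` and `rollout` are all assumptions made for the example. A feedforward network predicts the state one step ahead, and the whole batch trajectory is obtained by feeding each prediction back as the next input.

```python
import math

# Placeholder weights for illustration only (assumption); a real model
# would be trained on batch data.
W_HIDDEN = [[0.4, -0.3, 0.5], [-0.2, 0.6, 0.1]]  # 2 hidden neurons x (2 states + 1 control)
B_HIDDEN = [0.0, 0.1]
W_OUTPUT = [[0.3, -0.1], [-0.2, 0.2]]            # 2 state increments x 2 hidden neurons


def one_step_model(state, controls):
    """Feedforward network predicting the state at the next time step."""
    inputs = list(state) + list(controls)
    hidden = [math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
              for row, b in zip(W_HIDDEN, B_HIDDEN)]
    increments = [sum(w * h for w, h in zip(row, hidden)) for row in W_OUTPUT]
    return [s + d for s, d in zip(state, increments)]


def rollout(model, initial_state, control_profile):
    """Predict a batch trajectory by feeding each prediction back as input."""
    trajectory = [list(initial_state)]
    state = list(initial_state)
    for controls in control_profile:
        state = model(state, controls)  # previous output becomes the next input
        trajectory.append(list(state))
    return trajectory


# Hypothetical batch: state = [biomass, precursor], control = [feed rate].
initial_state = [0.1, 1.0]
feed_profile = [[0.0], [0.5], [0.5], [0.2]]
trajectory = rollout(one_step_model, initial_state, feed_profile)
for step, state in enumerate(trajectory):
    print(step, [round(value, 3) for value in state])
```

Only the initial state and the planned control profile are needed to predict a whole batch, which is what makes this arrangement usable for monitoring. The trade-off is that prediction errors accumulate along the trajectory, since each step starts from an estimate instead of a measurement.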