A brief overview of the assimilation of neural network based solutions by the (bio)pharmaceutical industry
João Almeida Lopes
Faculdade de Farmácia da Universidade de Lisboa
ISEP, Porto - 2015.10.01
Summary
Part [i]
Some concepts around the Artificial Neural Networks theme
Part [ii]
Applications of Artificial Neural Networks in the pharmaceutical field (R&D)
Part [iii]
Illustration of the use of Artificial Neural Networks in three situations (pharmaceutical processes)
Machine Learning
Machine learning deals with the construction and study of systems that can learn from data.
A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
In unsupervised learning, no labels are given to the learning algorithm, leaving it on its own to group similar inputs (clustering).
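The contrast between the two settings can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the one-dimensional data, the labels "low" and "high", and the helper names `fit_centroids`, `predict` and `kmeans_1d` are all hypothetical, and a nearest-centroid rule and k-means stand in for whatever learner is actually used.

```python
# Supervised: labels are given, so one centroid per label is learned
# and then used to map new examples.
def fit_centroids(xs, labels):
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    # Assign the label whose centroid is closest to x.
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Unsupervised: no labels; 1-D k-means groups similar inputs on its own.
def kmeans_1d(xs, k, iterations=20):
    ordered = sorted(xs)
    # Spread the initial centres across the sorted data (illustrative choice).
    centres = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centres[j]))
            groups[nearest].append(x)
        # Keep the old centre if a group ends up empty.
        centres = [sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)]
    return centres

data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]  # made-up values
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(data, labels)
print(predict(centroids, 4.5))   # supervised mapping of a new example
print(kmeans_1d(data, k=2))      # cluster centres found without labels
```

The supervised routine needs the `labels` list, while `kmeans_1d` only ever sees `data`.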
Purposes:
• association
• classification (clustering)
• transformation (different representation)
• modelling
Artificial Neural Networks are Machine Learning Models
Adapted from Demuth & Beale (2004) Neural Network Toolbox For Use with MATLAB®, The Mathworks, Natick, MA
Artificial Neural Networks (ANN)
Artificial neural networks (ANN) are computational models inspired by biological neurons.
Neurons are inter-connected and arranged in layers.
ANNs can be of very different types.
The number of parameters (degrees of freedom) is normally quite high.
Essentially a “black-box” model: fit for a purpose but difficult to interpret.
Like other data-driven modelling approaches, ANNs often have problems generalising.
Wesolowski & Suchacz (2012) Artificial Neural Networks: theoretical background and pharmaceutical applications: a review, J. AOAC Internat., 95(3) 652-668(17)
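To make the remark about the number of parameters concrete, the sketch below counts the weights and biases of a fully connected feedforward network. The layer sizes are hypothetical and chosen only for illustration; `count_parameters` is not taken from any toolbox.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feedforward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per neuron
    return total

# Hypothetical case: 10 inputs, one hidden layer of varying size, 1 output.
for hidden in (2, 5, 10, 20):
    sizes = [10, hidden, 1]
    print(sizes, count_parameters(sizes))
```

Even a single small hidden layer gives tens of adjustable parameters, which is why the amount of training data matters so much.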
ANNs can be of different nature and serve different purposes
Prediction
• Linear Filter
• Feedforward
• Elman (dynamic)
Classification
• Perceptron
• Self Organizing Map (Kohonen)
• Learning Vector Quantization (hybrid)
Other network types: Radial Basis, Hopfield (dynamic)
Adapted from Demuth & Beale (2004) Neural Network Toolbox For Use with MATLAB®, The Mathworks, Natick, MA
ANNs’ issues
To discover the best ANN model, numerous candidate models must be trained and evaluated. The difficulty is that many choices have to be made:
• Network type (feedforward, Kohonen, Elman, etc.)
• Network topology/size/structure (static, dynamic, etc.)
• Learning algorithm
• Learning procedure (batch, incremental, etc.)
• Learning algorithm parameters (learning rate, regularization, etc.)
• Fitting function
• Stopping procedure/validation
Special precautions are to be taken to avoid erroneous conclusions.
• Network initialization
• Over-fitting
• Model selection
Verikas & Bacauskiene (2003) Using artificial neural networks for process and system modelling, Chemom. Intell. Lab. Syst., 67, 187-191
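One widely used stopping procedure that guards against over-fitting is early stopping on a validation set. The sketch below assumes the validation error has already been recorded once per epoch; the function name, the `patience` parameter and the error values are illustrative assumptions, not something taken from the slides.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the index of the epoch whose weights would be kept."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Made-up validation curve: it improves, then degrades as the network over-fits.
curve = [0.90, 0.60, 0.45, 0.40, 0.42, 0.44, 0.47, 0.30]
print(early_stopping_epoch(curve, patience=3))
```

With a small `patience` the late improvement at the end of `curve` is never reached, which shows that the stopping rule is itself one of the choices that has to be tuned.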
ANNs don’t do miracles
ANNs estimate patterns or relationships (when properly used) directly from data. Therefore, ANNs don’t do miracles*!!
If the data are of poor quality, ANNs will generate spurious relationships, and these must be detected.
[Figure: two profiles. One is a NIR spectrum (diffuse reflectance against wavelength); the other is the Euribor 6 months rate (rate against time).]
One of the above profiles is a near infrared spectrum taken from a pharmaceutical product (powder mix). Which one is it?
*Incorporation of additional knowledge might be coupled with ANNs and these models are generally called “hybrid models”.
Number of publications involving “neural networks” in the pharma field
Source: Web of Science (Thomson Reuters)
Note (search strings):
TS=(neural networks) AND PY=####
TS=(neural networks) AND TS=(drug discovery OR pharmaceutics OR pharmaceutical industry OR pharmaceutical OR pharmaceutical technology OR drug design OR drug synthesis OR drug production) AND PY=####
Pharma accounts for circa 0.7% of all ANN-related publications
ANNs applied to pharmaceutical sciences [i]
DRUG DESIGN & DRUG DISCOVERY
• Analysis of structure–activity data (analysis of multi-dimensional data)
• Establishment of quantitative structure–activity relationships (QSAR)
• Combinatorial discovery
• Toxicity modelling
• Gene prediction
• Locating protein-coding regions in DNA sequences
• 3D structure alignment
• Pharmacophore perception
• Docking of ligands to receptors
• Automated generation of small organic compounds
• Design of combinatorial libraries
Terfloth & Gasteiger (2001) Neural networks and genetic algorithms in drug design, Drug Discov. Today, 6(15), 102-108
Winkler (2004) Neural networks as robust tools in drug lead discovery and development, Mol. Biotechnol., 27(2), 139-168
Livingstone & Salt (1992) Regression analysis for QSAR using neural networks. Bioorg. Med. Chem. Lett. 2, 213–218
ANNs applied to pharmaceutical sciences [ii]
DRUG DELIVERY
• Structure Retention Relationship (SRR) methodology in pharmacological research
• Pharmacokinetics and pharmacodynamics modelling
• Preformulation
• Optimization of pharmaceutical formulations
• Quantitative Structure-Activity Relationships (QSAR) and Quantitative Structure-Property Relationship (QSPR)
• In Vitro In Vivo correlations
• Proteomics and genomics
• Diagnosis of disease
Sutariya et al. (2013) Artificial Neural Network in Drug Delivery and Pharmaceutical Research, The Open Bioinform. J., 7, 49-62
Ibrić et al. (2003) Artificial neural networks in the modeling and optimization of aspirin extended release tablets with eudragit L 100 as matrix substance, AAPS PharmSciTech., 4(1), 62–70.
Bourquin et al. (1997) Basic concepts of artificial neural networks (ANN) modeling in the application to pharmaceutical development, Pharm. Dev. Technol., 2(2), 95-109
Takayama et al. (1999) Artificial Neural Network as a Novel Method to Optimize Pharmaceutical Formulations, Pharm. Res.,16(1), 1-6
ANNs applied to pharmaceutical sciences [iii]
PHARMACEUTICAL TECHNOLOGY/PRODUCTION
• Drugs production: synthesis (biological, chemical, etc.)
• Drugs production: downstream operations (isolation, purification, etc.)
• Processes modelling (granulation, coating, tabletting, mixture, etc.)
Mendyk et al. (2015) From Heuristic to Mathematical Modeling of Drugs Dissolution Profiles: Application of Artificial Neural Networks and Genetic Programming, Computational and Mathematical Methods in Medicine, Article ID 863874, 9 pp.
Examples
Case [i] Micro scale (research & development)
Predicting properties of polymeric nanoparticles for cancer vaccines manufacturing
Case [ii] Laboratory scale (polymers manufacturing)
Monitoring a laboratory scale bioreactor for polyhydroxybutyrates production
Case [iii] Industrial scale (antibiotics manufacturing)
Modelling an industrial scale fermentation process for antibiotic production
Case [i]
Predicting properties of polymeric nanoparticles for cancer vaccines manufacturing
Objective:
To predict, from manufacturing parameters, the properties of polymeric nanoparticles (average size around 150 nm) to be used as a cancer vaccine.
Feedforward ANN
INPUTS: Manufacturing parameters (PVA, Glycol Chitosan, Surfactant)
OUTPUTS: Nanoparticle properties (Size, Zeta Potential, Polydispersity Index, Encapsulation Efficiency, Loading Capacity)
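The shape of such a network, three inputs mapped to five outputs, can be sketched as a forward pass in plain Python. Everything beyond the input and output names is an assumption made for illustration: the single hidden layer of four tanh neurons, the random untrained weights, and the scaled input values. The printed numbers therefore carry no physical meaning.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, hidden_layer, output_layer):
    """One hidden tanh layer followed by a linear output layer."""
    hw, hb = hidden_layer
    ow, ob = output_layer
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hw, hb)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(ow, ob)]

INPUTS = ["PVA", "Glycol Chitosan", "Surfactant"]
OUTPUTS = ["Size", "Zeta Potential", "Polydispersity Index",
           "Encapsulation Efficiency", "Loading Capacity"]
N_HIDDEN = 4  # assumption: the slides do not state the hidden layer size

rng = random.Random(0)
hidden_layer = init_layer(len(INPUTS), N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, len(OUTPUTS), rng)

x = [0.5, 0.1, 0.2]  # made-up, already scaled manufacturing parameters
prediction = forward(x, hidden_layer, output_layer)
for name, value in zip(OUTPUTS, prediction):
    print(f"{name}: {value:.3f}")
```

Under these assumptions the network already has 4*3 + 4 + 5*4 + 5 = 41 adjustable parameters, so a small set of formulations leaves little room for larger topologies.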
Case [i]
Experimental design (22 formulations)
Case [i]
Manufacturing conditions impact on nanoparticles properties
Influence of input parameters on the estimation of particles average size (Z-average)
Influence of PVA and Glycol Chitosan
Influence of surfactants (external phase)
Case [ii]
Monitoring a laboratory scale bioreactor for polyhydroxybutyrates (PHB) production
Case [ii]
The process was monitored in situ and in real time with near infrared spectroscopy
Batches: 4 batch cycles (A, B, C, D)
On-line variables:
Time (h)
O2 (%)
pH
NIR spectra (900-1700 nm)
Off-line variables:
Ammonia (N-mmol.L-1)
PHB (C-mmol.L-1)
PHV (C-mmol.L-1)
Biomass (C-mmol.L-1)
Near infrared spectroscopy (NIR)
Feedforward ANN
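NIR spectra are usually pretreated before being passed to a model. The slides do not say which pretreatment was applied here, so the sketch below uses the standard normal variate (SNV) transform purely as an assumed, commonly used example; the spectrum values are made up.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    by its own mean and (sample) standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]

raw = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]  # made-up reflectance values
processed = snv(raw)
print([round(v, 3) for v in processed])
```

A perfectly flat spectrum has zero standard deviation and would need separate handling before this function is called.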
Case [ii]
NIR-based ANN model predictions
[Figure: near infrared spectra, raw and processed]
[Figure: predictions of [PHB] for the test batch, ANN predicted [PHB] (C-mmol.L-1) against experimental [PHB] (C-mmol.L-1); R2 > 0.9]
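A figure of merit such as R2 can be computed from paired experimental and predicted values. The sketch below uses the coefficient of determination, which is one common definition; the slides do not state which definition was used, and the concentration values are made up.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

experimental = [1.0, 2.0, 3.0, 4.0]    # made-up [PHB] values
ann_predicted = [1.1, 1.9, 3.2, 3.8]   # made-up predictions
print(round(r_squared(experimental, ann_predicted), 3))
```

For an honest estimate the statistic should be computed on a batch that was not used for training, as with the test batch above.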
Case [iii]
Modelling an industrial scale fermentation process for antibiotic production
Inoculum
Raw-materials
Fermentation
Case [iii]
Fermentations are intrinsically complex processes
[Figure: the cell factory. Scale labels: organelles (0.2-10 µm), microbial filament (50 µm), microbial pellet (1 mm), fermentation tank (100 m3). Pathway labels: nutrients uptake, α-AAA, CYS, VAL, ACVS, IPN, IPNS, AT, PhCO~CoA, PenG, intracellular antibiotic, secretion, cell wall.]
A. simplification of the major metabolic pathways in a cell
B. non-localized biosynthesis in industrial strains
C. filamentous antibiotic-producing microorganisms in submerged culture
D. diffusional limitations inside pellets add to the biochemical complexity
E. antibiotic production takes several days in large fed-batch stirred tank reactors
Case [iii]
Fermentation process description
[Figure: process flow diagram. Stages shown: lyophilized spores, selection, agar culture, vegetative culture, inoculum, fermentation. Additions: glucose syrup, phenylacetic acid, animal and vegetable oils, ammonia and ammonium sulphate.]
• Stagewise processes (scale-up of inoculum)
Inoculum (1.5 days) → Fermentation (10 days)
• Biomass growth in pellets
• Slow non-linear time varying processes
• History dependent processes
• Natural variability of complex raw materials
• Consistent control is hampered by monitoring limitations and by the many intrinsic difficulties of bioprocesses
Inefficient operation control
Case [iii]
Objectives
Oxygen (air)
Biomass
Substrates
Precursor
Product (antibiotic)
• On-Line (Feed Rates, O2, CO2, Pressure, pH, Temperature, Air Flow, Derived Process Variables)
• Off-Line (Biomass, Substrates and Product Concentrations, Physical Parameters)
Case [iii]
Major challenges
To model a complex system with almost no knowledge about process kinetics.
Industrial fermentation process with complex raw materials (variable composition) and intrinsic variability caused by bacteria.
Process with a total duration of approximately 140 hours.
Many operating variables that are changed over time according to an established policy.
Historical data are not appropriate for capturing process dynamics (too many similar batches) and contain a lot of missing data.
Available neural network types are not ideal for modelling batch processes.
Limited number of experimentally designed batches (industrial scale tests).
Case [iii]
Solutions
Experimental design of a series of 6 fermentations at industrial scale with variations in all input process variables (using a genetic algorithm).
Running all 6 experiments and collecting as much data as possible, sampling more frequently during the first 24 process hours.
Adapting feedforward ANNs to handle batch data (new training algorithm and topology).
Implementing the predictive ANNs and integrating predictions in the process monitoring console (Gensym G2 software).
Case [iii]
“Static” versus “Recurrent” ANNs
Standard ANN
Recurrent ANN
Outputs are used as inputs at the next prediction step
A simple recurrence rule
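The recurrence rule can be sketched as a loop in which each prediction is fed back as the input of the next step. The `toy_model` function is a hypothetical stand-in for a trained one-step ANN, and its state layout (biomass, product) and coefficients are invented for illustration only.

```python
def rollout(one_step_model, initial_state, controls):
    """Feed each predicted state back as an input for the next prediction step."""
    state = list(initial_state)
    trajectory = [state]
    for control in controls:
        state = one_step_model(state, control)  # the prediction becomes the next input
        trajectory.append(state)
    return trajectory

def toy_model(state, control):
    """Hypothetical stand-in for a trained ANN.
    state = [biomass, product]; control = feed rate."""
    biomass, product = state
    return [biomass + 0.1 * control, product + 0.05 * biomass]

path = rollout(toy_model, initial_state=[1.0, 0.0], controls=[1.0, 1.0, 0.5])
for step, state in enumerate(path):
    print(step, [round(v, 3) for v in state])
```

Because predictions are reused as inputs, errors can accumulate along the trajectory, so only the initial state and the planned controls need to be measured but the model must be accurate at every step.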
Case [iii]
“Static” versus “Recurrent” ANNs
Feedforward ANN (“static”)
Feedforward ANN (“recurrent”)
Case [iii]
ANNs predictions
Dynamic/Recurrent ANN topology
ANN inputs are state variables (typically concentrations), off-gas measurements, physical variables (pH, temperature), feed rates and aeration
[Figure: ANN predictions for Test Batch A and Test Batch B; panels show Biomass (g.L-1), Precursor Conc. (g.L-1) and API Conc. (a.u.) against Time (h)]
Solid lines are ANN estimations and markers are experimentally observed data
Case [iii]
Managing ANNs use and ANNs’ estimations
A small set of rules used essentially to detect different process stages
Case [iii]
Monitoring console
Feed rates, volume and tank concentrations monitoring
An instance of an animated fermenter object
On-Line monitored data
Network prediction
Fermenter attributes
Locates the last saved data file related to the chosen tank
10 different tanks can be monitored