Identification and Neural Networks
G. Horváth
ISRG, Department of Measurement and Information Systems
NIMIA, Crema, Italy, 10 October 2001

Identification and Neural Networks
Part III: Industrial application
http://www.mit.bme.hu/~horvath/nimia

Overview
- Introduction
- Modeling approaches
- Building neural models
- Database construction
- Model selection
- Modular approach
- Hybrid approach
- Information system
- Experiences with the advisory system
- Conclusions

Introduction to the problem
Task: to develop an advisory system for the operation of a Linz-Donawitz steel converter
- to propose the component composition
- to support the factory staff in supervising the steelmaking process
A model of the process is required.

LD converter modeling
(Figure: schematic drawing of the LD converter.)

Linz-Donawitz converter
Phases of steelmaking (illustrated step by step on a series of slides):
1. Filling of waste (scrap) iron
2. Filling of pig iron
3. Blasting with pure oxygen
4. Adding supplementary additives
5. Sampling for quality testing
6. Tapping of steel and slag

Main parameters of the process
- Nonlinear input-output relation between many inputs and two outputs
- Input parameters: ~50 different parameters
- The main output parameters are features "measured" during the process:
  - temperature: 1640-1700 °C (required accuracy -10 ... +15 °C)
  - carbon content: 0.03-0.70 %
- More than 5000 records of data

Modeling task
The difficulties of model building:
- high complexity, nonlinear input-output relationship
- no (or unsatisfactory) physical insight
- relatively few measurement data
- unmeasurable parameters
- noisy, imprecise, unreliable data
- the classical approach (heat balance, mass balance) gives no acceptable results

Modeling approaches
- Theoretical model: based on chemical and physical equations
- Input-output behavioral models:
  - neural model, based on the measured process data
  - rule-based system, based on the experiential knowledge of the factory staff
  - combined neural - rule-based system
The modeling task
(Block diagram: the process maps the amount of oxygen and the component parameters to the steel temperature. A neural model is trained on the component parameters so that its predicted temperature matches the measured temperature (error e), and a copy of the inverse model is then used to predict the required oxygen from the desired temperature.)

"Neural" solution
The steps of solving a practical problem:
Raw input data -> Preprocessing -> Neural network -> Postprocessing -> Results

Building neural models
- Creating a reliable database
  - the problem of noisy data
  - the problem of missing data
  - the problem of uneven data distribution
- Selecting a proper neural architecture
  - static network
  - dynamic network
  - regressor selection
- Training and validating the model

Creating a reliable database
- Input components: measure of importance, physical insight, sensitivity analysis, principal components
- Normalization: input normalization, output normalization
- Missing data: artificially generated data
- Noisy data: preprocessing, filtering

Building the database
Selecting input components, dimension reduction (iterative procedure): train a neural network on the initial database and perform a sensitivity analysis; if an input parameter has only a small effect on the output, cancel it and build a new database, otherwise keep the current database.

Building the database
Dimension reduction: mathematical methods
- PCA
- Nonlinear PCA
- ICA
- Combined methods

Data compression, PCA networks
Principal component analysis (Karhunen-Loève transformation)
(Figure: two-dimensional data cloud with the principal directions y1, y2 rotated relative to the original coordinates x1, x2.)

Oja network
Linear feed-forward network with a single output: $y = \mathbf{w}^T \mathbf{x}$, where $\mathbf{w}$ is the feed-forward weight vector.

Oja network
Learning rule: normalized Hebbian learning
$\Delta w_i = \mu\, y\, (x_i - y\, w_i)$, in vector form $\Delta \mathbf{w} = \mu\, y\, (\mathbf{x} - y\, \mathbf{w})$

Oja subspace network
Multi-output extension, $\mathbf{y} = \mathbf{W}\mathbf{x}$:
$\Delta \mathbf{W} = \mu \left( \mathbf{y}\mathbf{x}^T - \mathbf{y}\mathbf{y}^T \mathbf{W} \right)$

GHA, Sanger network
Multi-output extension: Oja rule + Gram-Schmidt orthogonalization
$\Delta \mathbf{w}_1 = \mu \left( y_1 \mathbf{x}^{(1)} - y_1^2 \mathbf{w}_1 \right)$
$\mathbf{x}^{(2)} = \mathbf{x}^{(1)} - \left( \mathbf{w}_1^T \mathbf{x}^{(1)} \right) \mathbf{w}_1 = \mathbf{x}^{(1)} - y_1 \mathbf{w}_1$
$\Delta \mathbf{w}_2 = \mu \left( y_2 \mathbf{x}^{(2)} - y_2^2 \mathbf{w}_2 \right) = \mu \left( y_2 \mathbf{x}^{(1)} - y_1 y_2 \mathbf{w}_1 - y_2^2 \mathbf{w}_2 \right)$
$\Delta \mathbf{w}_i = \mu \left( y_i \mathbf{x}^{(1)} - y_1 y_i \mathbf{w}_1 - \dots - y_i^2 \mathbf{w}_i \right) = \mu \left( y_i \mathbf{x}^{(1)} - \sum_{j=1}^{i} y_i y_j \mathbf{w}_j \right)$
In matrix form: $\Delta \mathbf{W} = \mu \left( \mathbf{y}\mathbf{x}^T - \mathrm{LT}[\mathbf{y}\mathbf{y}^T]\, \mathbf{W} \right)$, where LT[.] keeps the lower triangular part.

Nonlinear data compression
Nonlinear principal components
(Figure: data lying along a curved one-dimensional manifold in the (x1, x2) plane, described by a single nonlinear component y1.)

Independent component analysis
- A method of finding a transformation where the transformed components are statistically independent
- Applies higher-order statistics
- Based on the different definitions of statistical independence
- The typical task: $\mathbf{x} = \mathbf{A}\mathbf{s}$; find $\mathbf{B} \approx \mathbf{A}^{-1}$ so that $\hat{\mathbf{s}} = \mathbf{B}\mathbf{x}$; with noise $\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{n}$
- Can be implemented using a neural architecture

Normalizing data
(Figure: histograms of two process variables showing typical, uneven data distributions; the temperature values lie roughly between 1600 and 1740 °C.)

Normalization
- Zero mean, unit standard deviation:
  $\bar{x}_i = \frac{1}{P}\sum_{p=1}^{P} x_i^{(p)}$, $\sigma_i^2 = \frac{1}{P-1}\sum_{p=1}^{P} \left( x_i^{(p)} - \bar{x}_i \right)^2$, $\tilde{x}_i^{(p)} = \frac{x_i^{(p)} - \bar{x}_i}{\sigma_i}$
- Normalization into [0, 1]:
  $\tilde{x}_i^{(p)} = \frac{x_i^{(p)} - \min\{x_i\}}{\max\{x_i\} - \min\{x_i\}}$
- Decorrelation + normalization:
  $\boldsymbol{\Sigma} = \frac{1}{P-1}\sum_{p=1}^{P} \left( \mathbf{x}^{(p)} - \bar{\mathbf{x}} \right)\left( \mathbf{x}^{(p)} - \bar{\mathbf{x}} \right)^T$, $\tilde{\mathbf{x}}^{(p)} = \boldsymbol{\Lambda}^{-1/2}\boldsymbol{\Phi}^T \left( \mathbf{x}^{(p)} - \bar{\mathbf{x}} \right)$,
  where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \dots, \lambda_N)$, $\boldsymbol{\Sigma}\boldsymbol{\varphi}_j = \lambda_j \boldsymbol{\varphi}_j$ and $\boldsymbol{\Phi} = [\boldsymbol{\varphi}_1\ \boldsymbol{\varphi}_2 \dots \boldsymbol{\varphi}_N]$.
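A minimal NumPy sketch of the decorrelation + normalization (whitening) step above; the function name `whiten` and the toy data are illustrative choices, not part of the original material.

```python
import numpy as np

def whiten(X, eps=1e-10):
    """Decorrelation + normalization (whitening) of a data matrix.

    X: array of shape (P, N) -- P samples, N variables.
    Returns the whitened data, the mean and the whitening matrix,
    so that new samples can be transformed the same way.
    """
    x_mean = X.mean(axis=0)                        # per-variable mean
    Xc = X - x_mean                                # remove the mean
    Sigma = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance matrix
    lam, Phi = np.linalg.eigh(Sigma)               # Sigma = Phi diag(lam) Phi^T
    W = np.diag(1.0 / np.sqrt(lam + eps)) @ Phi.T  # Lambda^{-1/2} Phi^T
    return Xc @ W.T, x_mean, W

# After whitening, the sample covariance is (approximately) the identity:
X = np.random.rand(500, 3) @ np.array([[2.0, 0.5, 0.0],
                                        [0.0, 1.0, 0.3],
                                        [0.0, 0.0, 0.2]])
Xw, x_mean, W = whiten(X)
print(np.round(np.cov(Xw, rowvar=False), 2))       # ~ identity matrix
```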
Normalization
Decorrelation + normalization = whitening transformation
(Figure: scatter plots of the original and the whitened data.)

Missing or few data
Filling in the missing values:
- using the normalized cross-correlation between variables, $\tilde{C}(i,j) = \frac{C(i,j)}{\sqrt{C(i,i)\, C(j,j)}}$: a missing value can be estimated from a strongly correlated variable, $\hat{x}_i^{(h)} = \hat{f}\left( x_j^{(h)} \right)$, or simply replaced by the mean, $\hat{x}_i = \bar{x}_i$
- using the correlation in time, $R_i(t, \tau) = E\{ x_i(t)\, x_i(t+\tau) \}$
Artificially generated data:
- using trends
- using correlation
- using realistic transformations

Few data
Artificial data generation:
- using realistic transformations
- using sensitivity values: data generation around various working points (a good example: ALVINN)

Noisy data
- EIV (errors-in-variables): input and output noise are taken into consideration, modified criterion function
- SVM: ε-insensitive criterion function
- Inherent noise suppression:
  - classical neural nets have a noise-suppression property (inherent regularization)
  - averaging (modular approach)

Errors in variables (EIV)
Handling of noisy data: each sample is measured M times; the true values $x_k^*$, $y_k^*$ are observed with measurement noise,
$x_k^{[i]} = x_k^* + n_{x,k}^{[i]}$, $y_k^{[i]} = y_k^* + n_{y,k}^{[i]}$, $i = 1, \dots, M$.
Sample means: $\bar{x}_k = \frac{1}{M}\sum_{i=1}^{M} x_k^{[i]} = x_k^* + \bar{n}_{x,k}$, $\bar{y}_k = \frac{1}{M}\sum_{i=1}^{M} y_k^{[i]} = y_k^* + \bar{n}_{y,k}$
Sample variances and covariance:
$\sigma_{x,k}^2 = \frac{1}{M-1}\sum_{i=1}^{M} \left( x_k^{[i]} - \bar{x}_k \right)^2$, $\sigma_{y,k}^2 = \frac{1}{M-1}\sum_{i=1}^{M} \left( y_k^{[i]} - \bar{y}_k \right)^2$, $\sigma_{xy,k} = \frac{1}{M-1}\sum_{i=1}^{M} \left( x_k^{[i]} - \bar{x}_k \right)\left( y_k^{[i]} - \bar{y}_k \right)$

EIV
LS vs EIV criterion function:
$C_{LS} = \frac{1}{N}\sum_{k=1}^{N} \left( y_k^* - f_{NN}(x_k^*, \mathbf{W}) \right)^2$
$C_{EIV} = \frac{1}{N}\sum_{k=1}^{N} \left[ \frac{\left( \bar{y}_k - f_{NN}(x_k, \mathbf{W}) \right)^2}{\sigma_{y,k}^2} + \frac{\left( \bar{x}_k - x_k \right)^2}{\sigma_{x,k}^2} \right]$
where the estimated true inputs $x_k$ are adjusted together with the weights.
EIV training (gradients, with $e_{f,k} = \bar{y}_k - f_{NN}(x_k, \mathbf{W})$ and $e_{x,k} = \bar{x}_k - x_k$):
$\frac{\partial C_{EIV}}{\partial W_j} = -\frac{2}{N}\sum_{k=1}^{N} \frac{e_{f,k}}{\sigma_{y,k}^2}\, \frac{\partial f_{NN}(x_k, \mathbf{W})}{\partial W_j}$
$\frac{\partial C_{EIV}}{\partial x_k} = -\frac{2}{N} \left[ \frac{e_{f,k}}{\sigma_{y,k}^2}\, \frac{\partial f_{NN}(x_k, \mathbf{W})}{\partial x_k} + \frac{e_{x,k}}{\sigma_{x,k}^2} \right]$

EIV example
(Figures: a noisy nonlinear mapping and the curves fitted with the LS and the EIV criterion.)

SVM
Why SVM?
- "Classical" neural networks (MLP): overfitting; model, structure and parameter selection difficulties
- Support Vector Machine (SVM): better generalization (upper bounds); selects the more important input samples (support vectors); handles noise; (nearly) automatic structure and parameter selection

SVM
Special problems of SVM:
- selecting the hyperparameters of the ε-insensitive, RBF-type SVM: ε, C, σ
- slow "training", complex computations (SVM-Light)
- smaller, reduced training set
- difficulty of real-time adaptation

Selecting the optimal parameters
(Figures: SVM regression fits for C = 1, ε = 0.05 with σ = 0.9 and σ = 1.9, and the mean square error as a function of σ.)

Comparison of SVM, EIV and NN
(Figure: f(x) = sin(x)/x with the training points, the support vectors, and the training results of the SVM, the EIV-trained network and the MLP.)
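To illustrate the hyperparameter-selection problem (ε, C and the RBF width σ), the sketch below fits an ε-insensitive RBF support vector regressor to noisy samples of the sinc test function used in the comparison figure. It assumes scikit-learn's SVR, which is not mentioned in the slides; note that scikit-learn parameterizes the RBF kernel with gamma = 1/(2σ²). In practice σ, C and ε would be chosen by cross-validation, as the mean-square-error-versus-σ plot suggests.

```python
import numpy as np
from sklearn.svm import SVR

# Noisy samples of f(x) = sin(x)/x on [-10, 10]
rng = np.random.default_rng(0)
x = np.linspace(-10, 10, 100).reshape(-1, 1)
y = np.sinc(x / np.pi).ravel() + rng.normal(scale=0.1, size=x.shape[0])

# epsilon-insensitive RBF SVM; C, epsilon and sigma are the hyperparameters to select
sigma = 0.9
model = SVR(kernel="rbf", C=1.0, epsilon=0.05, gamma=1.0 / (2.0 * sigma**2))
model.fit(x, y)

print("number of support vectors:", len(model.support_))
print("training MSE:", np.mean((model.predict(x) - y) ** 2))
```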
Model selection
- Static or dynamic model
- Dynamic model class: regressor selection, basis function selection
- Size of the network: number of layers, number of hidden neurons, model order

Model selection
NARX model, NOE model:
$y_k = f\left( x(k), x(k-1), x(k-2), \dots, x(k-n), y(k-1), y(k-2), \dots, y(k-m) \right)$
Lipschitz quotient and Lipschitz number:
$q_{ij} = \frac{|y_i - y_j|}{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert}$, $q^{(n)} = \left( \prod_{k=1}^{p} \sqrt{n}\; q^{(n)}(k) \right)^{1/p}$
(Figures: the Lipschitz number plotted against the assumed model order; the curve flattens at the correct order.)

Model selection
Lipschitz quotient: for a general nonlinear input-output relation $y = f(x_1, x_2, \dots, x_n)$, with $f(\cdot)$ a continuous, smooth multivariable function with bounded derivatives,
$\left| \frac{\partial f}{\partial x_i} \right| = |f_i'| \le M, \quad i = 1, 2, \dots, n,$
the Lipschitz quotients
$q_{ij} = \frac{|y_i - y_j|}{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert}, \quad i \ne j$
are bounded: $0 < q_{ij} \le L$.
Sensitivity analysis:
$\Delta y = \frac{\partial f}{\partial x_1}\Delta x_1 + \frac{\partial f}{\partial x_2}\Delta x_2 + \dots + \frac{\partial f}{\partial x_n}\Delta x_n = f_1' \Delta x_1 + f_2' \Delta x_2 + \dots + f_n' \Delta x_n$

Model selection
Lipschitz number: with n regressors
$q_{ij}^{(n)} = \frac{|\Delta y|}{\sqrt{\Delta x_1^2 + \Delta x_2^2 + \dots + \Delta x_n^2}}$
remains bounded (of the order of $\sqrt{n} M$); if a necessary regressor is left out, $q_{ij}^{(n-1)} = \frac{|\Delta y|}{\sqrt{\Delta x_1^2 + \dots + \Delta x_{n-1}^2}}$ can become arbitrarily large.
$q^{(n)} = \left( \prod_{k=1}^{p} \sqrt{n}\; q^{(n)}(k) \right)^{1/p}$, $p \approx 0.01N \dots 0.02N$,
where $q^{(n)}(k)$ is the k-th largest Lipschitz quotient among all $q_{ij}^{(n)}$ ($i \ne j$; $i, j = 1, 2, \dots, N$).
For the optimal order n: $q^{(n+1)} \approx q^{(n)}$, while $q^{(n-1)} \gg q^{(n)}$.

Modular solution
- Ensemble of networks: linear combination of networks
- Mixture of experts
  - using the same paradigm (e.g. neural networks)
  - using different paradigms (e.g. neural networks + symbolic systems)
- Hybrid solution: expert systems, neural networks, physical (mathematical) models

Cooperative networks
Ensemble of cooperating networks (classification/regression)
The motivation:
- heuristic explanation: different experts together can solve a problem better; complementary knowledge
- mathematical justification: accurate and diverse modules

Ensemble of networks
Mathematical justification:
Ensemble output: $\bar{y}(\mathbf{x}, \boldsymbol{\alpha}) = \sum_{j=1}^{M} \alpha_j y_j(\mathbf{x})$
Ambiguity (diversity): $a_j(\mathbf{x}) = \left( y_j(\mathbf{x}) - \bar{y}(\mathbf{x}, \boldsymbol{\alpha}) \right)^2$
Individual error: $e_j(\mathbf{x}) = \left( d(\mathbf{x}) - y_j(\mathbf{x}) \right)^2$
Ensemble error: $e(\mathbf{x}) = \left( d(\mathbf{x}) - \bar{y}(\mathbf{x}, \boldsymbol{\alpha}) \right)^2$
Constraint: $\sum_j \alpha_j = 1$, $\alpha_j \ge 0$

Ensemble of networks
Mathematical justification (cont'd):
Weighted error: $\bar{e}(\mathbf{x}, \boldsymbol{\alpha}) = \sum_{j=1}^{M} \alpha_j e_j(\mathbf{x})$
Weighted diversity: $\bar{a}(\mathbf{x}, \boldsymbol{\alpha}) = \sum_{j=1}^{M} \alpha_j a_j(\mathbf{x})$
Ensemble error: $e(\mathbf{x}) = \left( d(\mathbf{x}) - \bar{y}(\mathbf{x}, \boldsymbol{\alpha}) \right)^2 = \bar{e}(\mathbf{x}, \boldsymbol{\alpha}) - \bar{a}(\mathbf{x}, \boldsymbol{\alpha})$
Averaging over the input distribution:
$E = \int e(\mathbf{x}, \boldsymbol{\alpha}) f(\mathbf{x})\, d\mathbf{x}$, $\bar{E} = \int \bar{e}(\mathbf{x}, \boldsymbol{\alpha}) f(\mathbf{x})\, d\mathbf{x}$, $\bar{A} = \int \bar{a}(\mathbf{x}, \boldsymbol{\alpha}) f(\mathbf{x})\, d\mathbf{x}$, so $E = \bar{E} - \bar{A}$.
Solution: ensemble of accurate and diverse networks.

Ensemble of networks
How to get accurate and diverse networks:
- different structures: more than one network type (e.g. MLP, RBF, CCN, etc.)
- different size, different complexity (number of hidden units, number of layers, nonlinear function, etc.)
- different learning strategies (BP, CG, random search, etc.); batch learning, sequential learning
- different training algorithms, sample order, learning samples
- different training parameters
- different initial parameter values
- different stopping criteria

Linear combination of networks
Fixed weights:
(Figure: the input x is fed to networks NN_1 ... NN_M with outputs y_1 ... y_M; together with the constant y_0 = 1 they are combined as)
$\bar{y}(\mathbf{x}, \boldsymbol{\alpha}) = \sum_{j=0}^{M} \alpha_j y_j(\mathbf{x})$

Linear combination of networks
Computation of the optimal coefficients:
- $\alpha_k = 1/M$ for all k: simple average
- $\alpha_k = 1$, $\alpha_j = 0$ for $j \ne k$: $\boldsymbol{\alpha}$ depends on the input; for different input domains a different network alone gives the output
- optimal values using the constraint $\sum_k \alpha_k = 1$
- optimal values without any constraint: Wiener-Hopf equation
  $\mathbf{R}_y = E\{ \mathbf{y}(\mathbf{x})\, \mathbf{y}(\mathbf{x})^T \}$, $\mathbf{P} = E\{ \mathbf{y}(\mathbf{x})\, d(\mathbf{x}) \}$, $\boldsymbol{\alpha}^* = \mathbf{R}_y^{-1} \mathbf{P}$
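A minimal sketch of the unconstrained optimal combination: the Wiener-Hopf solution α* = R_y⁻¹P amounts to a least-squares fit of the member outputs (plus the constant y₀ = 1) to the desired output, with R_y and P estimated from data. The function name and the toy "networks" below are assumptions of this sketch; the simple average corresponds to fixing α_k = 1/M instead of solving this equation.

```python
import numpy as np

def combination_weights(member_outputs, d):
    """Unconstrained optimal linear combination of ensemble members.

    member_outputs: array of shape (L, M) -- outputs y_1..y_M of the
                    member networks on L samples.
    d:              array of shape (L,)   -- desired output.
    Returns alpha (including alpha_0 for the constant y_0 = 1).
    """
    L = member_outputs.shape[0]
    Y = np.hstack([np.ones((L, 1)), member_outputs])  # prepend y_0 = 1
    R_y = Y.T @ Y / L                                  # estimate of E{y y^T}
    P = Y.T @ d / L                                    # estimate of E{y d}
    return np.linalg.solve(R_y, P)                     # alpha* = R_y^{-1} P

# Toy usage: three imperfect "networks" approximating the same target
rng = np.random.default_rng(1)
d = rng.normal(size=200)
Y = np.column_stack([d + rng.normal(scale=s, size=200) for s in (0.1, 0.3, 0.5)])
alpha = combination_weights(Y, d)
y_ens = np.hstack([np.ones((200, 1)), Y]) @ alpha
print("weights:", np.round(alpha, 3))
print("ensemble MSE:", np.mean((y_ens - d) ** 2))
```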
Mixture of Experts (MOE)
(Block diagram: the input x is fed to the expert networks 1 ... M with outputs μ_1 ... μ_M and to a gating network producing the weights g_1 ... g_M; the output is μ = Σ g_i μ_i.)

Mixture of Experts (MOE)
The output is the weighted sum of the outputs of the experts:
$\boldsymbol{\mu} = \sum_{i=1}^{M} g_i \boldsymbol{\mu}_i$, $\boldsymbol{\mu}_i = f(\mathbf{x}, \boldsymbol{\Theta}_i)$, $\sum_{i=1}^{M} g_i = 1$, $g_i \ge 0$
$\boldsymbol{\Theta}_i$ is the parameter of the i-th expert.
The output of the gating network is the "softmax" function:
$g_i = \frac{e^{\xi_i}}{\sum_{j=1}^{M} e^{\xi_j}}$, $\xi_i = \mathbf{v}_i^T \mathbf{x}$
$\mathbf{v}_i$ is the parameter of the gating network.

Mixture of Experts (MOE)
Probabilistic interpretation:
$g_i = P(i \mid \mathbf{x}, \mathbf{v}_i)$, $\boldsymbol{\mu}_i = E[\mathbf{y} \mid \mathbf{x}, i]$
The probabilistic model with the true parameters:
$P(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\Theta}^0) = \sum_i g_i(\mathbf{x}, \mathbf{v}_i^0)\, P(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\Theta}_i^0)$,
where $g_i(\mathbf{x}, \mathbf{v}_i^0) = P(i \mid \mathbf{x}, \mathbf{v}_i^0)$ is the a priori probability.

Mixture of Experts (MOE)
Training: training data $X = \{ \mathbf{x}^{(l)}, \mathbf{y}^{(l)} \}_{l=1}^{L}$.
Probability of generating the output from the input:
$P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}) = \sum_i P(i \mid \mathbf{x}^{(l)}, \mathbf{v}_i)\, P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}_i)$
$P(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\Theta}) = \prod_{l=1}^{L} P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}) = \prod_{l=1}^{L} \sum_i P(i \mid \mathbf{x}^{(l)}, \mathbf{v}_i)\, P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}_i)$
The log-likelihood function (maximum likelihood estimation):
$L(\mathbf{x}, \boldsymbol{\Theta}) = \sum_l \log \sum_i P(i \mid \mathbf{x}^{(l)}, \mathbf{v}_i)\, P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}_i)$

Mixture of Experts (MOE)
Training (cont'd): gradient method, $\frac{\partial L(\mathbf{x}, \boldsymbol{\Theta})}{\partial \boldsymbol{\Theta}_i} = 0$ and $\frac{\partial L(\mathbf{x}, \boldsymbol{\Theta})}{\partial \mathbf{v}_i} = 0$.
The parameters of the expert networks:
$\boldsymbol{\Theta}_i(k+1) = \boldsymbol{\Theta}_i(k) + \eta \sum_{l=1}^{L} h_i^{(l)} \left( \mathbf{y}^{(l)} - \boldsymbol{\mu}_i^{(l)} \right) \frac{\partial \boldsymbol{\mu}_i^{(l)}}{\partial \boldsymbol{\Theta}_i}$
The parameters of the gating network:
$\mathbf{v}_i(k+1) = \mathbf{v}_i(k) + \eta \sum_{l=1}^{L} \left( h_i^{(l)} - g_i^{(l)} \right) \mathbf{x}^{(l)}$

Mixture of Experts (MOE)
Training (cont'd):
A priori probability: $g_i^{(l)} = g_i(\mathbf{x}^{(l)}, \mathbf{v}_i) = P(i \mid \mathbf{x}^{(l)}, \mathbf{v}_i)$
A posteriori probability:
$h_i^{(l)} = \frac{g_i^{(l)}\, P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}_i)}{\sum_j g_j^{(l)}\, P(\mathbf{y}^{(l)} \mid \mathbf{x}^{(l)}, \boldsymbol{\Theta}_j)}$

Mixture of Experts (MOE)
Training (cont'd): EM (Expectation-Maximization) algorithm
- a general iterative technique for maximum likelihood estimation
- introducing hidden variables
- defining a log-likelihood function
- two steps: expectation of the hidden variables; maximization of the (expected) log-likelihood function

EM (Expectation-Maximization)
A simple example: estimating the means of k (= 2) Gaussians
(Figure: measurements drawn from two overlapping Gaussian densities f(y | μ1) and f(y | μ2).)

EM algorithm
A simple example: estimating the means of k (= 2) Gaussians
Hidden variables for every observation: $(x^{(l)}, z_1^{(l)}, z_2^{(l)})$, with
$z_1^{(l)} = 1$, $z_2^{(l)} = 0$ if $x^{(l)} \in X^{(1)}$ and $z_1^{(l)} = 0$, $z_2^{(l)} = 1$ if $x^{(l)} \in X^{(2)}$.
Likelihood function: $f\left( x^{(l)}, z_i^{(l)} \mid \boldsymbol{\mu} \right) = \prod_{i=1}^{k} f\left( x^{(l)} \mid \mu_i \right)^{z_i^{(l)}}$
Log-likelihood function: $L = \log f\left( x^{(l)}, z_i^{(l)} \mid \boldsymbol{\mu} \right) = \sum_{i=1}^{k} z_i^{(l)} \log f\left( x^{(l)} \mid \mu_i \right)$
Expected value of $z_i^{(l)}$ with given $\mu_1$ and $\mu_2$:
$E[z_1^{(l)}] = \frac{f(x^{(l)} \mid \mu_1)}{\sum_{j=1}^{2} f(x^{(l)} \mid \mu_j)}$, $E[z_2^{(l)}] = \frac{f(x^{(l)} \mid \mu_2)}{\sum_{j=1}^{2} f(x^{(l)} \mid \mu_j)}$

EM algorithm
A simple example (cont'd): the expected log-likelihood function
$E[L] = \sum_{i=1}^{k} E[z_i^{(l)}] \log f\left( x^{(l)} \mid \mu_i \right)$,
where
$f\left( x^{(l)} \mid \mu_i \right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left[ -\frac{\left( x^{(l)} - \mu_i \right)^2}{2\sigma^2} \right]$, $\log f\left( x^{(l)} \mid \mu_i \right) = -\frac{1}{2}\log\left( 2\pi\sigma^2 \right) - \frac{\left( x^{(l)} - \mu_i \right)^2}{2\sigma^2}$.
The estimate of the means:
$\hat{\mu}_i = \frac{\sum_{l=1}^{L} E[z_i^{(l)}]\, x^{(l)}}{\sum_{l=1}^{L} E[z_i^{(l)}]}$
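The two-Gaussian example translates almost directly into code. The sketch below assumes equal priors and a fixed, known common σ, as in the slide, and estimates only the means; the function name and the synthetic data are illustrative.

```python
import numpy as np

def em_two_gaussian_means(x, sigma=1.0, n_iter=50):
    """EM estimation of the means of two Gaussians with known, common sigma."""
    mu = np.array([x.min(), x.max()], dtype=float)      # crude initial guesses
    for _ in range(n_iter):
        # E-step: expected hidden variables E[z_i^(l)] (responsibilities)
        f = np.exp(-(x[:, None] - mu[None, :])**2 / (2 * sigma**2))
        z = f / f.sum(axis=1, keepdims=True)
        # M-step: means that maximize the expected log-likelihood
        mu = (z * x[:, None]).sum(axis=0) / z.sum(axis=0)
    return mu

# Data drawn from two Gaussians with means -2 and 3
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 200)])
print(np.round(em_two_gaussian_means(x), 2))             # approximately [-2, 3]
```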
Hybrid solution
Utilization of different forms of information:
- measurement (experimental) data
- symbolic rules
- mathematical equations, physical knowledge

The hybrid information system
Solution: integration of the measurement information and the experiential knowledge about the process and its results.
Realization:
- development system: supports the design and testing of different hybrid models
- advisory system: hybrid models using the current process state and input information; the experiences collected by the rule-based system can be used to update the model

The hybrid-neural system
(Block diagram of the oxygen prediction: the input data pass through an input-data preparatory expert system, then through a mixture-of-experts system of neural networks NN_1 ... NN_K with outputs O_1 ... O_K; an output estimator expert system and a correction term expert system feed the output expert system, which either gives the oxygen prediction or gives no prediction together with an explanation.)

The hybrid-neural system
Data preprocessing and correction: input data -> data preprocessing -> neural model.

The hybrid-neural system
Conditional network running: an expert system selects one of the neural models NN_1 ... NN_k on the basis of the input data, and only the selected network is run to produce its output.

The hybrid-neural system
Parallel network running: all networks NN_1 ... NN_k are run; an expert selects an NN model (or combines the outputs O_1 ... O_k), and an output expert performs the post-processing to obtain the oxygen prediction.

The hybrid-neural system
Iterative network running: run the neural network and make a prediction; if the result is not satisfactory, modify the input parameters and repeat.
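The conditional and iterative running modes can be summarized in a short control-loop sketch. All names here (models, select_model, is_satisfactory, modify_inputs) are hypothetical placeholders, not the advisory system's actual interface.

```python
def run_advisory(models, select_model, is_satisfactory, modify_inputs, x, max_iter=10):
    """Conditional + iterative network running (illustrative sketch).

    models:          dict of trained neural models, keyed by operating regime
    select_model:    rule-based expert that picks a regime key from the input data
    is_satisfactory: rule-based check of the prediction (output expert)
    modify_inputs:   proposes new component/parameter values if the check fails
    """
    prediction = None
    for _ in range(max_iter):
        net = models[select_model(x)]        # conditional running: one model is chosen
        prediction = net(x)                  # neural network running, prediction making
        if is_satisfactory(x, prediction):   # result satisfactory -> done
            return x, prediction
        x = modify_inputs(x, prediction)     # otherwise modify the input parameters
    return x, prediction
```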
The hybrid information system
Main modules and their user interfaces:
- data table management (hard disk)
- filters (filters user interface)
- neural network module (neural networks user interface)
- expert system module (expert system user interface)
- analysis module (analysis user interface)

The structure of the system
- user, real-time display system, user interface controller
- controller
- process and oxygen models (hybrid neural-expert models)
- data filtering, data conversion
- services (explanation, help, etc.)
- result verification, model maintenance, model adaptation
- process control system and database system interface
- process control and database servers

Validation
- Model selection: iterative process, utilization of domain knowledge
- Cross-validation: fresh data, on-site testing

Experiences
- The hit rate is increased by about 10%.
- Most of the special cases can be handled.
- Further rules for handling special cases should be obtained.
- The accuracy of the measured data should be increased.

Conclusions
- For complex industrial problems all available information has to be used.
- Neural networks should not be regarded as universal modeling devices to be used alone; physical insight is important.
- Preprocessing and post-processing are important.
- Modular approach: decomposition of the problem; cooperation and competition; "experts" using different paradigms.
- The hybrid approach to the problem provided better results.

References and further readings
Pataki, B., Horváth, G., Strausz, Gy., Talata, Zs.: "Inverse Neural Modeling of a Linz-Donawitz Steel Converter", e&i Elektrotechnik und Informationstechnik, Vol. 117, No. 1, 2000.
Strausz, Gy., Horváth, G., Pataki, B.: "Experiences from the results of neural modelling of an industrial process", Proc. of Engineering Applications of Neural Networks, EANN'98, Gibraltar, 1998, pp. 213-220.
Strausz, Gy., Horváth, G., Pataki, B.: "Effects of database characteristics on the neural modeling of an industrial process", Proc. of the International ICSC/IFAC Symposium on Neural Computation, NC'98, Vienna, Sept. 1998, pp. 834-840.
Horváth, G., Pataki, B., Strausz, Gy.: "Neural Modeling of a Linz-Donawitz Steel Converter: Difficulties and Solutions", Proc. of EUFIT'98, 6th European Congress on Intelligent Techniques and Soft Computing, Aachen, Germany, Sept. 1998, pp. 1516-1521.
Horváth, G., Pataki, B., Strausz, Gy.: "Black box modeling of a complex industrial process", Proc. of the 1999 IEEE Conference and Workshop on Engineering of Computer Based Systems, Nashville, TN, USA, 1999, pp. 60-66.
Bishop, C. M.: "Neural Networks for Pattern Recognition", Clarendon Press, Oxford, 1995.
Berényi, P., Horváth, G., Pataki, B., Strausz, Gy.: "Hybrid-Neural Modeling of a Complex Industrial Process", Proc. of the IEEE Instrumentation and Measurement Technology Conference, IMTC/2001, Budapest, May 21-23, 2001, Vol. III, pp. 1424-1429.
Berényi, P., Valyon, J., Horváth, G.: "Neural Modeling of an Industrial Process with Noisy Data", IEA/AIE-2001, 14th International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems, Budapest, June 4-7, 2001, Lecture Notes in Computer Science, Springer, pp. 269-280.
Jordan, M. I., Jacobs, R. A.: "Hierarchical Mixtures of Experts and the EM Algorithm", Neural Computation, Vol. 6, pp. 181-214, 1994.
Hashem, S.: "Optimal Linear Combinations of Neural Networks", Neural Networks, Vol. 10, No. 4, pp. 599-614, 1997.
Krogh, A., Vedelsby, J.: "Neural Network Ensembles, Cross Validation and Active Learning", in Tesauro, G., Touretzky, D., Leen, T. (eds.): Advances in Neural Information Processing Systems 7, MIT Press, Cambridge, MA, pp. 231-238.