Solid-to-Solid Phase Transformations in Inorganic Materials 2005, Volume 2: Displacive Transformations; Fundamentals of Phase Transformations; Experimental Approaches to the Study of Phase Transformations; Computer Approaches to Simulations of Phase Transformations; Phase Transformations in Novel Systems or Special Materials. Edited by James M. Howe, David E. Laughlin, Jong K. Lee, Ulrich Dahmen, and William A. Soffa. TMS (The Minerals, Metals & Materials Society), 2005.

NEURAL NETWORK MODELLING FOR THE PREDICTION OF BAINITE AND MARTENSITE START TEMPERATURE IN STEELS

C. Garcia-Mateo1, T. Sourmail2, F.G. Caballero1, C. Capdevila1 and C. García de Andrés1

1 Dept. of Physical Metallurgy, Solid-Solid Phase Transformations Group (GITFES), Centro Nacional de Investigaciones Metalúrgicas (CENIM-CSIC), Avda. Gregorio del Amo, 8, 28040 Madrid, Spain.
2 Dept. of Materials Science and Metallurgy, Phase Transformation Group, Cambridge University, Pembroke St., Cambridge CB2 3QZ, UK. Now at TWI Ltd.
Keywords: bainite and martensite start temperature, neural network, Bayesian framework

Abstract

The significance of the bainite and martensite start temperatures is unquestionable for the steel community. A considerable amount of work has therefore been devoted not only to understanding the physical mechanisms underlying both transformations, but also to obtaining quantitatively accurate models for predicting both temperatures. Nowadays, with few limitations on computer power, rigorous and complex data analysis methods should be applied wherever possible. Neural network analysis thus becomes a very attractive alternative: it is easily distributed and self-sufficient, and it can accompany its predictions with an indication of their reliability.

Introduction

The martensite and bainite start temperatures, Ms and Bs, are defined as the highest temperatures at which austenite transforms to martensite and bainite respectively. Because of the excellent combination of properties achieved by these microstructures and their wide range of industrial applications, there is an understandable and considerable industrial interest in predicting both temperatures reliably. The exact values of these temperatures depend strongly on the chemical composition of the steel, and considerable work has been devoted to developing quantitative models for this compositional dependence. This has long been done using linear regressions or similar methods. We classify these as non-adaptive because the 'shape' of the function is pre-determined by the authors rather than adapted to the data. Such methods therefore have very limited ranges of applicability because of their inability to capture interactions or non-linear effects. In contrast, neural network methods, as discussed later, are adaptive functions.
On the other hand, with the development of calculation frameworks such as CALPHAD (CALculation of PHAse Diagrams) [1], which allow the prediction of thermodynamic properties of complex systems from data collected on simpler ones, more physically based approaches relying on the satisfaction of a thermodynamic criterion have gained importance [2,3]. This approach allows a much wider range of applicability than linear regression. Furthermore, the physical basis suggests that it should extrapolate relatively safely, unless the mechanisms taken into account change significantly with composition, or the empirical thermodynamic data behave badly in extrapolation. Still, these approaches suffer from some limitations. Some of the thermodynamic criteria used for the calculation of Ms and Bs are model-dependent, in the sense that they are implicitly linked with the database that was used when deriving the function expressing their compositional dependence. This becomes a problem if different databases are used in deriving the criterion and in making predictions (or, more exactly, if the different databases describe similar systems differently). With the increasing number of thermodynamic databases available, this problem cannot be neglected. In addition, the accuracy of the model may be limited by that of the underlying thermodynamic database, so the empirical component is not eliminated but displaced to lower levels of the model. Finally, making predictions requires access to expensive thermodynamic calculation software and databases. Newer empirical methods such as neural network analysis offer attractive advantages: they are not only easily distributed and self-sufficient but also able to cover arbitrarily large ranges of data. As with any other method, their domain of applicability is determined by the data available at the time the model is built.
However, a feature unique to the method employed in the present work is the ability of the model to accompany its predictions with an indication of their reliability; this is discussed further in ref. [4]. The method has been used successfully to address a variety of problems, see for example ref. [5].

Neural network modelling

Method

Neural networks, in the present context, essentially refer to non-linear multiple regression tools using adaptive functions. The following section does not detail the technique (see for example [4,6,7]), but presents its main features. The typical structure of a neural network is shown in Figure 1.

Figure 1. The typical structure of a neural network as used for non-linear multiple regressions. The first layer is made up of the inputs (x_1, ..., x_i), the second of so-called 'hidden units' and the last one is the output.

The hidden units (the second layer in Figure 1) take as input a weighted sum of the inputs and return its hyperbolic tangent:

    z_j = tanh( Σ_i w_ji x_i )    (1)

The third layer combines these outputs using a linear superposition:

    y = Σ_j ω_j z_j    (2)

where the w_ji and ω_j are often referred to as the weights defining the network. 'Training' the network means identifying an optimal set of weights, given some data for which the output is known. This is similar in principle to identifying the slope and intercept of the best-fit line in a linear regression. The fundamental difference between this type of regression and linear regression is that neural networks correspond to adaptive functions. In traditional methods, the author fixes the form of the equation (for example, a second-degree polynomial), and identifies the parameters that lead to the optimal fit of the observed data.
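As a minimal sketch, Eqs. (1) and (2) can be written out directly; the weights below are purely illustrative (they are not fitted to any steel data, and the real models use the Bayesian training described later):

```python
import math

def forward(x, w_hidden, w_out):
    """Forward pass of the two-layer network of Eqs. (1)-(2):
    each hidden unit returns z_j = tanh(sum_i w_ji * x_i),
    and the output is the linear superposition y = sum_j omega_j * z_j."""
    z = [math.tanh(sum(w_ji * x_i for w_ji, x_i in zip(row, x)))
         for row in w_hidden]
    return sum(w_j * z_j for w_j, z_j in zip(w_out, z))

# Two inputs, two hidden units, illustrative weight values.
w_hidden = [[0.5, -0.2],
            [1.0,  0.3]]
w_out = [0.7, -0.4]
y = forward([1.0, 2.0], w_hidden, w_out)
```

Note that because tanh is bounded in (-1, 1), the output magnitude can never exceed the sum of the absolute third-layer weights, which is part of what keeps such networks well behaved.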
Even in the few cases where the authors take the trouble to assess more than one function (for example, to determine whether a second- or third-degree polynomial is more appropriate), the extent to which the function is adapted to the data is very limited. With neural networks, however, the complexity of the function is mainly controlled by the weights themselves, so that the optimisation includes a determination of the most suitable shape for the function. This flexibility is not without drawbacks: overfitting is the cause of most problems in neural network modelling. Overfitting occurs when an overly complex function is chosen, so that the noise, rather than the trend in the data, is fitted by the function. One method widely applied to limit overfitting is to perform the optimisation on only one part of the data, then use the second part to determine which level of complexity best fits the data. This is illustrated in Figure 2.

Figure 2. The problem of overfitting with neural networks can be avoided if only part of the data is used to optimise the network (here the filled circles). At this stage, the best solution appears to be the one that goes through all the filled circles. When using the second part of the dataset (crosses), however, it becomes obvious that this solution is strongly overfitted and that the real trend is better captured by a simpler model.

Bayesian/Classical framework

There are two ways to understand regression, whether in the context of neural networks or of linear, polynomial, etc. regression. The first, and still the most often encountered, consists in defining an error function and minimising it by adjusting the parameters. We will refer to this as the classical method. Bayesian probabilities offer a far more interesting approach, in which the final model not only encompasses the knowledge present in the data, but also an estimation of the uncertainty on this knowledge.
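The notion of a prediction accompanied by a data-dependent error bar can be mimicked, very crudely, by fitting an ensemble of simple models to bootstrap resamples of the data and using the spread of their predictions. This is an analogy only, not the Bayesian evidence framework actually used in this work, and all data below are synthetic:

```python
import random

random.seed(1)
# Synthetic data: y = 2x + noise, with x in [0, 1].
data = [(i / 10, 2 * (i / 10) + random.gauss(0, 0.2)) for i in range(11)]

def fit_line(points):
    """Ordinary least-squares slope and intercept."""
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n

# Ensemble: refit on bootstrap resamples of the dataset.
fits = [fit_line([random.choice(data) for _ in data]) for _ in range(200)]

def predict(x):
    """Mean prediction and spread (a crude stand-in for a Bayesian error bar)."""
    ys = [a * x + b for a, b in fits]
    mean = sum(ys) / len(ys)
    spread = (sum((v - mean) ** 2 for v in ys) / len(ys)) ** 0.5
    return mean, spread

# The error bar grows when extrapolating away from the data,
# just as in Figure 3 for regions where data are sparse:
_, near = predict(0.5)    # inside the data range
_, far = predict(10.0)    # far outside it
assert far > near
```

The qualitative behaviour matches the figure: where many fits agree (dense data), the spread is small; where they diverge (sparse data or extrapolation), it is large.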
Rather than identifying optimum parameters, an optimum probability distribution of parameter values is fitted to the data. In regions of input space where data are sparse, this distribution will be wide, indicating that a number of solutions could fit the problem with similar probabilities. If a large amount of data is available, this distribution will be narrow, indicating that one shape of function is significantly more probable than any other. Because it can be quantified, the uncertainty on the determination of the network parameters can be translated into an uncertainty on the prediction. This is illustrated in Figure 3.

Figure 3. Illustration of the possibilities offered by Bayesian neural networks: the prediction can be accompanied by an error bar related to the uncertainty of fitting. Where data are sparse, the uncertainty of fitting is larger than in regions with sufficient data.

Whether for linear regression or neural networks, a Bayesian approach should always be preferred because it allows predictions to be accompanied by an indication of the uncertainty. Because of the flexibility of the method, this is particularly important in neural network modelling. For further details on the method, we point to the review by Mackay [8].

Databases

A complete description of the chemical composition and the transformation temperatures is ideally required to model the MS and BS temperatures in steels.

For the BS temperature

A literature survey [9-14] allowed us to collect 247 individual cases for which detailed chemical composition and transformation temperatures were reported. Table I shows the list of 11 input variables used for the BS temperature analysis. Some of the alloys contain traces of elements such as P, S, N and B, which have not been included in the model.

Table I. Variables that influence BS temperature in steels. Concentrations are in wt.%.

      C     Si    Mn    Ni    Cr    Mo   Cu    Al    V    W      Co   BS(ºC)
Min.  0.11  0     0     0     0     0    0     0     0    0      0    244.15
Max.  1.5   1.67  3.76  5.04  11.5  8    0.26  0.99  2.1  18.59  5    704

Compared with other models, the range of compositions has been extended by 1-2 wt.% for C, Si, Mn and V, and by more than 5 wt.% in the case of Cr, Mo and W. To the authors' knowledge, Al is included for the first time in a study of this kind.

For the MS temperature

For the MS temperature, data were obtained from a variety of sources [15-35]. This resulted in a database containing about 1200 entries and covering a wide variety of compositions, as illustrated in Table II.

Table II. Variables that influence MS temperature in steels. Concentrations are in wt.%.

Element  Min(wt%)  Max(wt%)   Element  Min(wt%)  Max(wt%)
C        0.0       2.25       Co       0.0       30.0
Mn       0.0       10.24      Al       0.0       3.01
Si       0.0       3.8        W        0.0       18.59
Cr       0.0       17.98      Cu       0.0       3.04
Ni       0.0       31.54      Nb       0.0       1.98
Mo       0.0       8.0        Ti       0.0       2.52
V        0.0       4.55       B        0.0       0.06
N        0.0       2.65

As emphasised by Mackay [36], it is important to ensure that any knowledge about the system is somehow present in the database, or in the network structure. The assessment recently published by the present authors [4] illustrated the fact that the Ms temperature should be bounded between 0 and 1000 K. While this was naturally present in the thermodynamic approach of Ghosh and Olson [15,37-39], existing neural network models are not necessarily bounded [40,41], although, as shown by Yescas et al. [42], it is possible to formulate the output in such a way that it has lower and upper limits. One way to incorporate this knowledge is to train the model using a function of the target which is naturally bounded in the desired interval [36,42-44]. The present network was trained using y = ln(−ln(Ms/1000)), and therefore Ms = 1000 exp(−exp(y)), which is bounded between 0 and 1000.
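The bounding transform can be checked directly. The sketch below implements the stated formula y = ln(−ln(Ms/1000)) and its inverse (not the trained network itself), and verifies that any real-valued network output maps back into the physical interval:

```python
import math

def to_target(ms_kelvin):
    """Training target: y = ln(-ln(Ms/1000)), defined for 0 < Ms < 1000 K."""
    return math.log(-math.log(ms_kelvin / 1000.0))

def to_ms(y):
    """Inverse transform: Ms = 1000*exp(-exp(y)), which lies in (0, 1000) K
    for any real-valued network output y."""
    return 1000.0 * math.exp(-math.exp(y))

# Round trip at a typical Ms value:
y = to_target(650.0)
assert abs(to_ms(y) - 650.0) < 1e-9

# Even large network outputs stay within the physical bounds:
for out in (-5.0, 0.0, 5.0):
    assert 0.0 < to_ms(out) < 1000.0
```

This is what eliminates the risk of wild predictions mentioned in the conclusions: no matter how far the network extrapolates in y, the recovered Ms cannot leave (0, 1000) K.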
It is important to highlight that in all the collected cases, for both temperatures, there was no interference from previous transformations or precipitation, meaning that the austenite chemical composition is that reported for the bulk material. The training procedure has been described numerous times in the literature, for example in [7]. In the present study, a commercial package [45] was used which implements the algorithm written by Mackay [43].

Results

The performance of the models was assessed on sets of data unseen during training. Figure 4 shows the performance of both models on these data.

Figure 4. Performance of both models on a test dataset unseen during training: predicted versus measured BS (ºC) and MS (K).

Conclusions

Using a large amount of published data, neural network models have been trained to predict the Ms and BS temperatures of steels over a wide range of compositions. By using a carefully selected function of Ms, rather than Ms itself, as the target, it was possible to put bounds on the output, thereby eliminating the risk of wild predictions. The models were shown to perform well. The Bayesian framework means that not only the knowledge present in the database is reflected in the model, but also the absence of it, as the model will produce large error bars for predictions where data were sparse during training.

Acknowledgements

The authors acknowledge financial support from the Spanish Comisión Interministerial de Ciencia y Tecnología (CICYT) (project MAT 2001-1617). C. Garcia-Mateo and F.G. Caballero would like to thank the Spanish Ministerio de Educación y Ciencia for financial support in the form of Ramón y Cajal contracts (RyC 2002 and 2004 respectively).

References

1. N. Saunders and A.P. Miodownik, CALPHAD (Calculation of Phase Diagrams): A Comprehensive Guide, ed.
Pergamon Materials Series (1998).
2. G.B. Olson and M. Cohen, Metall. Trans. A, 7A (1976) 1897-1923.
3. H.K.D.H. Bhadeshia, Bainite in Steels, 2nd ed., Institute of Materials, London (2001).
4. T. Sourmail and C. Garcia-Mateo, "Critical assessment of models for predicting the Ms temperature of steels", Comp. Mater. Sci., in press.
5. T. Sourmail, H.K.D.H. Bhadeshia and D.J.C. Mackay, Mater. Sci. Technol., 18 (2002) 655-663.
6. D.J.C. Mackay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, Cambridge (2003).
7. H.K.D.H. Bhadeshia, ISIJ Int., 39 (1999) 966-979.
8. D.J.C. Mackay, Network: Comput. Neural Syst., 6 (1995) 469-505.
9. W. Steven and A.G. Haynes, JISI, 183 (1956) 349-359.
10. H.E. Boyer, Atlas of Isothermal Transformation and Cooling Transformation Diagrams, American Society for Metals, Metals Park, OH (1977).
11. Atlas of Isothermal Transformation of B.S. En Steels, 2nd ed., Special Report No. 56, The Iron and Steel Institute, London (1956).
12. L.C. Chang, Metall. Trans. A, 30 (1999) 909-916.
13. C. Garcia-Mateo, F.G. Caballero and H.K.D.H. Bhadeshia, ISIJ Int., 43 (2003) 1821-1825.
14. G.F. Vander Voort (ed.), Atlas of Time-Temperature Diagrams for Irons and Steels, ASM International, Metals Park, OH (1991).
15. G. Ghosh and G.B. Olson, Acta Mater., 42 (1994) 3361-3370.
16. M. Atkins, Atlas of Continuous Cooling Transformation Diagrams for Engineering Steels, Tech. rep., British Steel Corporation.
17. M. Economopoulos, N. Lambert and L. Habraken, Diagrammes de transformation des aciers fabriqués dans le Benelux [Transformation diagrams of steels produced in the Benelux], Tech. rep., Centre National de Recherches Métallurgiques (1967).
18. Atlas of Isothermal Transformation Diagrams of B.S. En Steels, Special Report No. 40, Tech. rep., The British Iron and Steel Research Association (1949).
19. Atlas of Isothermal Transformation Diagrams of B.S. En Steels, 2nd ed., Special Report No. 56, Tech. rep., The Iron and Steel Institute (1956).
20. Atlas of Isothermal Transformation and Cooling Transformation Diagrams, Tech. rep., American Society for Metals (1977).
21. A.B. Greninger, Trans. ASM, 30 (1942) 1-26.
22. T.G. Digges, Trans. ASM, 28 (1940) 575-600.
23. T. Bell and W.S. Owen, JISI, 205 (1967) 1777-1786.
24. K. Ishida and T. Nishizawa, Trans. JIM, 15 (1974) 218-224.
25. M. Oka and H. Okamoto, Metall. Trans. A, 19 (1988) 447-452.
26. J.S. Pascover and S.V. Radcliffe, Trans. AIME, 242 (1968) 673-682.
27. R.B.G. Yeo, Trans. AIME, 227 (1963) 884-890.
28. A.S. Sastri and D.R.F. West, JISI, 203 (1965) 138-145.
29. U.R. Lenel and B.R. Knott, Metall. Trans. A, 18 (1987) 767-775.
30. W. Steven and A.G. Haynes, JISI, 183 (1956) 349-359.
31. R.H. Goodenow and R.F. Heheman, Trans. AIME, 233 (1965) 1777-1786.
32. R.A. Grange and H.M. Stewart, Trans. AIME, 167 (1945) 467-494.
33. M.M. Rao and P.G. Winchell, Trans. AIME, 239 (1967) 956-960.
34. P. Payson and C.H. Savage, Trans. ASM, 33 (1944) 261-281.
35. E.S. Rowland and S.R. Lyle, Trans. ASM, 37 (1946) 27-47.
36. D.J.C. Mackay, Bayesian Non-linear Modelling with Neural Networks, http://www.inference.phy.cam.ac.uk/mackay/cpi_short.pdf (1995).
37. G. Ghosh and G.B. Olson, Acta Mater., 42 (1994) 3371-3379.
38. G. Ghosh and G.B. Olson, J. Phase Equilib., 22 (3) (2001) 199-207.
39. G. Ghosh and G.B. Olson, Acta Mater., 50 (2002) 2655-2675.
40. C. Capdevila, F.G. Caballero and C. García de Andrés, ISIJ Int., 42 (2002) 894-902.
41. W.G. Vermeulen, P.F. Morris, A.P. de Weijer and S. van der Zwaag, Ironmaking and Steelmaking, 23 (1996) 433-437.
42. M.A. Yescas-Gonzalez and H.K.D.H. Bhadeshia, Mater. Sci. Eng. A, 333 (2002) 60-66.
43. D.J.C. Mackay, Neural Computation, 4 (1992) 448-472.
44. D.J.C. Mackay, Neural Computation, 4 (1992) 698-714.
45. Model Manager, Neuromat Ltd, www.neuromat.com (2003).