Solid-to-Solid Phase Transformations in Inorganic Materials 2005
Volume 2: Displacive Transformations; Fundamentals of Phase Transformations; Experimental Approaches to the Study of Phase Transformations; Computer Approaches to Simulations of Phase Transformations; Phase Transformations in Novel Systems or Special Materials
Edited by James M. Howe, David E. Laughlin, Jong K. Lee, Ulrich Dahmen, and William A. Soffa
TMS (The Minerals, Metals & Materials Society), 2005
NEURAL NETWORK MODELLING FOR THE PREDICTION OF
BAINITE AND MARTENSITE START TEMPERATURE IN STEELS
C. Garcia-Mateo1, T. Sourmail2, F.G. Caballero1, C. Capdevila1 and C. García de Andrés1
1 CENIM-CSIC, Dept. of Physical Metallurgy, Solid-Solid Phase Transformations Group (GITFES),
Centro Nacional de Investigaciones Metalúrgicas (CENIM), Avda. Gregorio del Amo, 8, 28040 Madrid,
Spain.
2 Cambridge University, Dept. of Materials Science and Metallurgy, Phase Transformation Group,
Pembroke St., Cambridge CB2 3QZ, UK. Now at TWI Ltd.
Keywords: Bainite and martensite start temperature, neural network, Bayesian framework
Abstract
The significance of the bainite and martensite start temperatures is unquestionable for the steel
community; a considerable amount of work has therefore been devoted not only to understanding
the physical mechanisms underlying both transformations, but also to obtaining quantitatively
accurate models for predicting both temperatures.
Nowadays, with few limitations in computer power, rigorous and complex data analysis methods
should be applied wherever possible. Neural network analysis thus becomes a very attractive
alternative: it is easily distributed, self-sufficient, and able to accompany its predictions with
an indication of their reliability.
Introduction
The martensite and bainite start temperatures, Ms and Bs, are defined as the highest temperatures
at which austenite transforms to martensite and bainite, respectively. Due to the excellent
combination of properties achieved by these microstructures and their wide range of industrial
applicability, there is an understandable and considerable industrial interest in being able to
predict reliably both temperatures.
The exact values of these temperatures strongly depend on the chemical composition of the steel,
and considerable work has been devoted to developing quantitative models for their
compositional dependency. This has long been done using linear regressions or similar methods.
We have classified these as non-adaptive because the `shape' of the function is pre-determined
by the authors rather than adapted to the data. Such methods therefore have very limited ranges
of applicability because of their inability to grasp interactions or non-linear effects. In contrast,
neural network methods, as discussed later, are adaptive functions. On the other hand, with the
development of calculation frameworks such as CALPHAD (CALculation of PHAse Diagrams) [1],
which allow prediction of thermodynamic properties of complex systems from data collected on
simpler ones, more physically relevant approaches relying on the satisfaction of some
thermodynamic criterion have gained importance [2,3]. This approach allows a much wider
range of applicability than linear regression. Furthermore, the physical basis suggests that it
should extrapolate relatively safely unless the mechanisms taken into account change
significantly with composition, or the empirical thermodynamic data behave badly in
extrapolation. Still, these approaches suffer from some limitations: in those models, some of the
thermodynamic criteria used for the calculation of Ms and Bs are model-dependent in the
sense that they are implicitly linked with the database that has been used during the derivation of
the function to express its compositional dependency. This becomes a problem if different
databases are used in deriving the criterion and in making predictions (or more exactly, if the
different databases describe similar systems differently). With the increasing number of
thermodynamics databases available, this problem cannot be neglected. In addition, the accuracy
of the model may be limited by that of the underlying thermodynamic database, therefore the
empirical component is not eliminated but displaced to lower levels of the model. Finally,
making predictions requires access to expensive thermodynamic calculation software and
databases.
New empirical methods such as neural network analysis offer attractive advantages, being not
only easily distributed and self-sufficient but also able to cover arbitrarily large ranges of data.
As with any other method, their domain of applicability is somewhat determined by the data available
at the time the model is defined. However, a feature unique to the method employed in the
present work is the ability of the model to accompany its predictions with an indication of their
reliability; this is discussed further in Ref. [4]. This method has been used with success to
address a variety of problems, see for example Ref. [5].
Neural network modelling
Method
Neural networks, in the present context, essentially refer to non-linear multiple regression tools
using adaptive functions. The following section will not detail the technique (see for example
[4,6,7]), but presents the main features of the method. The typical structure of a neural network is
presented in Figure 1.
Figure 1. The typical structure of a neural network as used for non-linear multiple regressions.
The first layer is made up of the inputs (x1, ..., xi), the second of the so-called `hidden units', and
the last one is the output.
The hidden units (the second layer in Figure 1) take as input a weighted sum of the inputs and
return its hyperbolic tangent:

z_j = tanh( Σ_i w_ji x_i )    (1)

The third layer combines these outputs using a linear superposition:

y = Σ_j ω_j z_j    (2)

where the w_ji and ω_j are often referred to as the weights defining the network. ‘Training’ the
network implies identifying an optimal set of weights, given some data for which the output is
known. This is similar in principle to identifying the slope and intercept of the best fit line in a
linear regression. The fundamental difference between this type of regression and linear
regression is that neural networks correspond to adaptative functions. In traditional methods, the
author fixes the form of the equation (for example, a second degree polynomial), and identifies
the parameters that lead to optimal fitting of the observed data. Even in the few cases where the
authors take the trouble to assess more than one function (for example, to determine whether a
second or third degree polynomial is most appropriate), the extent to which the function is
adapted to the data is very limited.
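As a concrete reading of Eqs. (1) and (2), the sketch below (Python with NumPy) evaluates a network of this structure. All weights and inputs are invented for illustration; they are not taken from the paper's trained models.

```python
import numpy as np

def network_output(x, W, omega):
    """Evaluate the two-layer network of Eqs. (1) and (2):
    hidden units z_j = tanh(sum_i w_ji * x_i), output y = sum_j omega_j * z_j."""
    z = np.tanh(W @ x)   # Eq. (1): one tanh per hidden unit
    return omega @ z     # Eq. (2): linear superposition of hidden outputs

# Illustrative shapes only: 3 inputs, 4 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # weights w_ji
omega = rng.normal(size=4)       # weights omega_j
x = np.array([0.5, -1.0, 2.0])   # inputs x_i
y = network_output(x, W, omega)
```

"Training" then amounts to adjusting `W` and `omega` until `y` matches the known outputs for the training data.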
With neural networks however, the complexity of the function is mainly controlled by the
weights themselves, so that the optimisation includes a determination of the most suitable shape
for the function. This flexibility is not without a drawback: overfitting is the cause of most
problems in neural network modelling. Overfitting occurs when an overly complex function is
chosen, so that the noise, rather than the trend in the data, is fitted by the function. One method
widely applied to limit overfitting is to perform the optimisation on only one part of the data,
then use the second part to determine which level of complexity best fits the data. This is
illustrated in Figure 2.
Figure 2. The problem of overfitting with neural networks can be avoided if only part of the data
is used to optimise the network (here the filled circles). At this stage, the best solution appears to
be the one which goes through all the filled circles. When using the second part of the dataset
(crosses), it becomes obvious, however, that this solution strongly overfits the data, and that the
real trend is better captured by a simpler model.
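The train/validate strategy described above can be sketched as follows. This hypothetical Python example uses polynomial fits of increasing degree as a stand-in for networks of increasing complexity; the noisy sine data and the split are invented for illustration.

```python
import numpy as np

# Invented data: a smooth trend plus noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

train = slice(0, 40, 2)   # "filled circles": used for fitting
valid = slice(1, 40, 2)   # "crosses": used only for model selection

# Fit models of increasing complexity on the training subset, then
# keep the complexity that best predicts the held-out subset.
val_errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x[train], y[train], degree)
    resid = np.polyval(coeffs, x[valid]) - y[valid]
    val_errors[degree] = float(np.mean(resid ** 2))

best = min(val_errors, key=val_errors.get)
```

A function that is too simple (degree 1) fails on both subsets, while an overly complex one fits the training points and their noise but is exposed by the validation points.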
Bayesian/Classical framework
There are two ways to understand regression, whether in the context of neural networks or of
linear, polynomial, etc. regression. The first, and still the most often encountered, consists of
defining an error function and minimising it by adjusting the parameters. We will refer to this as
the classical method.
Bayesian probabilities offer a far more interesting approach, by which the final model
encompasses not only the knowledge present in the data, but also an estimation of the uncertainty
in this knowledge.
Rather than identifying optimum parameters, an optimum probability distribution of parameter
values is fitted to the data. In regions of space where data are sparse, this distribution will be
wide, indicating that a number of solutions could fit the problem with similar probabilities. If a
large amount of data is available, this distribution will be narrow, indicating that one shape of
function is significantly more probable than any other.
Because it can be quantified, the uncertainty on the determination of the network parameters can
be translated into an uncertainty on the prediction. This is illustrated in Figure 3.
Figure 3. Illustration of the possibilities offered by Bayesian neural networks: the prediction can
be accompanied by an error bar related to the uncertainty of fitting. When data are sparse, the
uncertainty of fitting is larger than in regions with sufficient data.
Whether for linear regression or neural networks, a Bayesian approach should always be
preferred, because it allows predictions to be accompanied by an indication of the uncertainty.
Because of the flexibility of the method, this is particularly important in neural network modelling.
For further details on the method, we point to the review by Mackay [8].
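A rough sketch of the idea behind Figure 3: instead of one set of parameters, keep an ensemble of plausible fits and report the spread of their predictions as the error bar. Here a bootstrap ensemble of polynomial fits is used as a crude stand-in for the full Bayesian posterior over weights; the data, the gap in them, and the query points are all invented for illustration.

```python
import numpy as np

# Invented data with a deliberate gap between x = 0.4 and x = 0.9.
rng = np.random.default_rng(2)
x = np.concatenate([np.linspace(0.0, 0.4, 30), np.linspace(0.9, 1.0, 5)])
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

def predict_with_error_bar(x_query, n_models=50, degree=3):
    """Refit on bootstrap resamples; the spread plays the role of the error bar."""
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, x.size, x.size)      # bootstrap resample
        coeffs = np.polyfit(x[idx], y[idx], degree)
        preds.append(np.polyval(coeffs, x_query))
    preds = np.asarray(preds)
    return float(preds.mean()), float(preds.std())  # prediction, "error bar"

mean_dense, err_dense = predict_with_error_bar(0.2)  # data-rich region
mean_gap, err_gap = predict_with_error_bar(0.7)      # data-sparse gap
```

In the data-rich region the resampled fits agree closely, while in the gap they diverge, so the reported uncertainty grows exactly where the model knows least.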
Databases
A complete description of the chemical composition and the transformation temperatures is
required to ideally model the MS and BS temperatures in steels.
For the BS temperature
A literature survey [9-14] allowed us to collect 247 individual cases where detailed chemical
composition and transformation temperatures were reported.
Table I shows the list of 11 input variables used for the BS temperature analysis. Some of the
alloys contain traces of elements such as P, S, N and B, which have not been included in the
model.
Table I. Variables that influence BS temperature in steels. Concentrations are in wt.%.

        C     Si    Mn    Ni    Cr    Mo   Cu    Al    V    W      Co   BS (ºC)
  Min.  0.11  0     0     0     0     0    0     0     0    0      0    244.15
  Max.  1.5   1.67  3.76  5.04  11.5  8    0.26  0.99  2.1  18.59  5    704
Compared with other models, the range of compositions has been extended by 1-2 wt.% for
C, Si, Mn and V, and by more than 5 wt.% in the case of Cr, Mo and W. To the knowledge of the
authors, Al is included for the first time in a study of these characteristics.
For the MS temperature
For the MS temperature, data were obtained from a variety of sources [15-35]. This resulted in a
database containing about 1200 entries and covering a wide variety of compositions, as
illustrated in Table II.
Table II. Variables that influence MS temperature in steels. Concentrations are in wt.%.

  Element  Min (wt.%)  Max (wt.%)    Element  Min (wt.%)  Max (wt.%)
  C        0.0         2.25          Co       0.0         30.0
  Mn       0.0         10.24         Al       0.0         3.01
  Si       0.0         3.8           W        0.0         18.59
  Cr       0.0         17.98         Cu       0.0         3.04
  Ni       0.0         31.54         Nb       0.0         1.98
  Mo       0.0         8.0           Ti       0.0         2.52
  V        0.0         4.55          B        0.0         0.06
  N        0.0         2.65
As emphasised by Mackay [36], it is important to ensure that any knowledge about the system is
somehow present in the database, or in the network structure. The assessment recently published
by the present authors [4] illustrated the fact that the Ms temperature should be bounded between
0 and 1000 K. While this was naturally present in the thermodynamic approach of Ghosh and
Olson [15,37-39], existing neural network models are not necessarily bounded [40,41], although
as shown by Yescas et al. [42], it is possible to formulate the output in such a way that it has
lower and upper limits. One way to incorporate this knowledge is to train the model using a
function of the target which is naturally bounded in the desired interval [36, 42-44].
The present network was trained using y = ln(−ln(Ms/1000)), and therefore
Ms = 1000 exp(−exp(y)), which is bounded between 0 and 1000 K.
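The transform and its inverse can be checked directly. The sketch below (Python, with an invented example Ms value) verifies the round trip and the bounds.

```python
import math

def to_bounded_target(ms_kelvin):
    """Forward transform used as the training target: y = ln(-ln(Ms/1000)).
    Only defined for 0 < Ms < 1000 K."""
    return math.log(-math.log(ms_kelvin / 1000.0))

def from_network_output(y):
    """Inverse transform: Ms = 1000 * exp(-exp(y)), always inside (0, 1000) K."""
    return 1000.0 * math.exp(-math.exp(y))

# Round trip for a hypothetical Ms of 600 K:
roundtrip = from_network_output(to_bounded_target(600.0))
# Even extreme (unbounded) network outputs map back inside the physical bounds:
bounded = [from_network_output(y) for y in (-5.0, 0.0, 5.0)]
```

Because the network's raw output y is unbounded while the inverse transform is not, no choice of weights can produce a physically absurd Ms prediction.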
It is important to highlight that, in all the collected cases and for both temperatures, there
was no interference from previous transformations or precipitation, meaning that the austenite
chemical composition is that reported for the bulk material.
The procedure of training has been described numerous times in the literature, for example in
Ref. [7]. In the present study, a commercial package [45] was used which implements the
algorithm written by Mackay [43].
Results
The performance of the models created was assessed on sets of data unseen during training.
Figure 4 shows the performance of both models on this dataset.
[Figure: two scatter plots, BS predicted vs. measured (ºC) and MS predicted vs. measured (K).]
Figure 4. Performance of both models on a test dataset of data unseen during training.
Conclusions
Using a large amount of published data, neural network models have been trained to predict the
Ms and BS temperatures of steels over a wide range of compositions. By using a carefully
selected function of Ms, rather than Ms itself, as the target, it was possible to put bounds on the
output, therefore eliminating the risk of wild predictions. The models were shown to perform
well.
The Bayesian framework means that the model reflects not only the knowledge present in the
database, but also the absence of it, as the model will produce large error bars for predictions
where data were sparse during training.
Acknowledgements
The authors acknowledge financial support from the Spanish Comision Interministerial de
Ciencia y Tecnología (CICYT) (project-MAT 2001-1617). C. Garcia-Mateo and F.G. Caballero
would like to thank Spanish Ministerio de Educación y Ciencia for the financial support in the
form of Ramón y Cajal contracts (RyC 2002 and 2004 respectively).
References
1. N. Saunders and A.P. Miodownik, CALPHAD Calculation of Phase Diagrams, A
comprehensive Guide, ed. Pergamon Materials Series (1998).
2. G.B. Olson and M. Cohen, Metall. Trans., 7A (1976) 1897-1923.
3. H.K.D.H. Bhadeshia, Bainite in Steels, 2nd ed. Institute of Materials, London, (2001).
4. T. Sourmail, C. Garcia-Mateo “Critical assessment of models for predicting the Ms
temperature of steels” Comp. Mater. Sci. in press.
5. T. Sourmail, H.K.D.H. Bhadeshia, and D.J.C. Mackay. Mater. Sci. Technol., 18 (2002)
655-663.
6. D.J.C. Mackay. Information Theory, Inference, and Learning Algorithms. Cambridge
University Press, Cambridge 2003.
7. H.K.D.H Bhadeshia. ISIJ Int., 39 (1999) 966-979.
8. D.J.C. Mackay. Network: Comput. Neural Syst., 6 (1995) 469-505.
9. W. Steven and A.G. Haynes, JISI, 183 (1956) 349-359.
10. H.E. Boyer, “Atlas of Isothermal transformation and cooling transformation diagrams”.
American Society of Metals. Metals Park, Ohio 44073. 1977.
11. Atlas of Isothermal transformation of B.S. En Steels. 2nd edition. Special report No. 56.
The Iron and steel Institute. 4 Grosvenor gardens, London, SWI. 1956.
12. L.C. Chang. Metall. Trans. A., 30 (1999) 909-916.
13. C. Garcia-Mateo, F.G. Caballero and H.K.D.H. Bhadeshia. ISIJ International, 43 (2003)
1821-1825.
14. G.F. Vander Voort (ed) “Atlas of Time-temperature diagrams for Irons and Steels” ASM
International, Metals Park, OH, 1991.
15. G.Ghosh, G.B. Olson, Acta Mat. 42 (1994) 3361-3370.
16. M.Atkins, Atlas of continuous cooling transformation diagrams for engineering steels,
Tech. rep., British Steel Corporation.
17. M.Economopoulos, N.Lambert, L.Habraken, Diagrammes de transformation des aciers
fabriqués dans le benelux, Tech. rep., Centre National de Recherches Métallurgiques
(1967).
18. Atlas of isothermal transformation diagrams of b.s. en steels. special report no 40, Tech.
rep., The British Iron and Steel research association (1949)
19. Atlas of isothermal transformation diagrams of b.s. en steels.(2nded) special report no 56,
Tech. rep., The Iron and Steel Institute (1956)
20. Atlas of isothermal transformation and cooling transformation diagrams, Tech. rep.,
American Society for Metals (1977).
21. A.B. Greninger, Trans. ASM 30 (1942) 1-26.
22. T.G. Digges, Trans. ASM 28 (1940) 575-600.
23. T.Bell, W.S. Owen, JISI 205 (1967) 1777-1786.
24. K.Ishida, T.Nishizawa, Trans. JIM 15 (1974) 218-224.
25. M.Oka, H.Okamoto, Metall. Trans. A 19 (1988) 447-452.
26. J.S. Pascover, S.V. Radcliffe, Trans. AIME 242 (1968) 673-682.
27. R.B.G. Yeo, Trans AIME 227 (1963) 884-890.
28. A.S. Sastri, D.R.F. West, JISI 203 (1965) 138-145.
29. U.R. Lenel, B.R. Knott, Metal. Trans. A 18 (1987) 767-775.
30. W.Steven, A.G. Haynes, JISI 183 (1956) 349-359.
31. R.H. Goodenow, R.F. Heheman, Trans. AIME 233 (1965) 1777-1786.
32. R.A. Grange, H.M. Stewart, Trans. AIME 167 (1945) 467-494.
33. M.M. Rao, P.G. Winchell, Trans. AIME 239 (1967) 956-960.
34. P.Payson, C.H. Savage, Trans. ASM 33 (1944) 261-281.
35. E.S. Rowland, S.R. Lyle, Trans. ASM 37 (1946) 27-47.
36. D. J. C. Mackay, Bayesian non-linear modelling with neural networks,
http://www.inference.phy.cam.ac.uk/mackay/cpi_short.pdf (1995).
37. G. Ghosh, G. B. Olson, Acta Mat. 42 (1994) 3371–3379.
38. G. Ghosh, G. B. Olson, J. Phase Eq. 22 (3) (2001) 199–207.
39. G. Ghosh, G. B. Olson, Acta Mat. 50 (2002) 2655–2675.
40. C. Capdevila, F.G. Caballero, C. García de Andrés, ISIJ Int., 42 (2002) 894–902.
41. W. G. Vermeulen, P. F. Morris, A. P. de Weijer, S. van der Zwaag, Ironmaking and
Steelmaking 23 (1996) 433–437.
42. M. A. Yescas-Gonzales, H. K. D. H. Bhadeshia, Mater. Sci. Eng. A 333 (2002) 60–66.
43. D. J. C. Mackay, Neural computation 4 (1992) 448-472.
44. D. J. C. Mackay, Neural computation 4 (1992) 698-714.
45. Model Manager, Neuromat Ltd, www.neuromat.com (2003).