P009 - Use of data mining techniques for better insights of iron making processes at Tata Steel

Team
1. Arunabh Bhattacharjee (Speaker)
2. Shambhu Tiwary
3. Ashish Chakravorty

Agenda
• Why do you need data mining?
• Data mining and business intelligence
• Business value
• Methodology
• Application Case: Reduce NH3 in clean coke oven gas
• Conclusions

Why do you need data mining?
• Companies offer similar products & services using comparable technology
• Proprietary technologies are rapidly copied, and breakthrough innovation is not always possible
• Geographical location & protective regulation are not always an advantage
What, then, is the key differentiating factor? How do you get the competitive advantage?
Answer: Fast, smart and evidence-based decision making

Predictive analytics & data mining
• Business Intelligence: a set of technologies and processes that use data to understand and analyze business performance.
• Predictive analytics: encompasses statistical techniques, data mining models, text mining, etc.
• Data Mining: the process of extracting valid, useful, unknown, and comprehensible information from data.
• "Predictive analytics" is used interchangeably with "data mining"; the term has become popular and is used by most IT vendors.

Business Intelligence & Analytics
[Figure: Competitive Advantage plotted against Degree of Intelligence. Access & reporting: Standard Reports, Ad hoc Reports, Query Drill down, Alerts. Analytics: Statistical analysis, Forecasting/Exploration, Predictive Modeling, Optimization.]

Business value
• Complexity level increases with business value
[Figure label: High Value, High Complexity]

Methodology
Cross Industry Standard Process for Data Mining (CRISP-DM)
• Internationally recognized methodology
• Advantages: industry neutral, tool independent
[Figure: CRISP-DM cycle around the data: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment]

Application Case: Reduce NH3 in clean coke oven gas
• Function of the by-product plant: remove impurities like tar, ammonia and naphthalene from the gas
• Coke oven (C.O.) gas is used as a fuel in:
  • Coke plant
  • Rest of the steel plant
• Ammonia (NH3) is highly corrosive
[Figure labels: Coke oven, oven heating chamber, coke oven gas to by-product plant, clean coke oven gas]

Schematic Layout of Coke by-product plant
[Figure labels: Battery heating, chilled water, cooling water, gas holder, booster, EXH, C.O. gas, PC DC, ETP, P.S., A.S., N.S., W.P., FL. LIQ., tar decanter, condensate tank, ammonia still, tar, naphthalene still, liquor storage, P.T., biological oxidation treatment plant, naphthalene vapour incinerator, **to foul gas main]
Legend:
• PC DC: Primary cooling and deep cooling
• PS: Pre scrubbing
• AS: Ammonia scrubbing

Ammonia Removal Circuit
[Figure labels: C.O. gas, pre scrubbers, ammonia scrubbers 1 to 3, rich liquor, rich liquor tank, rich/lean liquor heat exchanger, ammonia still, steam, stripped liquor, stripped liquor tank, S/L cooler, S/L chiller, incinerator. Values marked on the diagram: 1.20 gm/100 cc, 0.005 gm/100 cc, SLT 30 ºC, and temperatures of 35, 40, 70 and 95 ºC.]
Legend:
• SLF: Stripped liquor flow
• SLT: Stripped liquor temperature

Process Requirement - Key challenges
1. Find the range of parameters that keeps ammonia in clean gas below 40, taking out one parameter at a time.
2. As a next step, take out the PCDC temperature (T) in combination with the scrubber temperatures (first T+GT1, then T+GT2, and finally T+GT3) and see the effect on the other parameters.
3. Finally, take out T, GT1, GT2, and GT3, and see what the range of the remaining parameters should be.

Data Preparation
For a more comprehensive analysis, the following key parameters were considered:
• Gas scrubber temperatures (GT1, GT2, GT3)
• Gas temperature (T) after primary cum deep cooling (PCDC)
• Stripped liquor flow (m³/hr)
• Stripped liquor conc. (mg/100cc)
• Stripped liquor temp. (ºC)
• Ammonia in clean C.O. gas (mg/Nm³/hr)
More than 2 years of data (FY13, FY15) have been used.
The final subset of data was then treated for missing values, outliers, etc., over multiple iterations.
The maximum amount of time and effort was spent at this stage.
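The treatment for missing values and outliers described above might look like the following minimal Python sketch. The deck does not say which rules were applied, so dropping incomplete records and filtering with a 1.5 x IQR fence are assumptions chosen for illustration, as are the function names and the sample values. Column names reuse the abbreviations from the slides.

```python
from statistics import quantiles

# Column names follow the abbreviations used in the deck.
COLUMNS = ["GT1", "GT2", "GT3", "T", "SLF", "SLC", "SLT", "NH3"]


def drop_missing(records):
    """Keep only records that have a value for every column."""
    return [r for r in records if all(r.get(c) is not None for c in COLUMNS)]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_outliers(records, k=1.5):
    """Remove records where any column falls outside its IQR fences."""
    bounds = {c: iqr_bounds([r[c] for r in records], k) for c in COLUMNS}
    return [
        r for r in records
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in COLUMNS)
    ]


def prepare(records):
    """One cleaning pass: missing values first, then outliers."""
    return drop_outliers(drop_missing(records))


# Hypothetical sample: nine plausible records, one with a gap, one with a spike.
sample = [
    {"GT1": 30 + 0.1 * i, "GT2": 29.5 + 0.1 * i, "GT3": 29.5 + 0.1 * i,
     "T": 22 + 0.1 * i, "SLF": 52 + 0.1 * i, "SLC": 0.005 + 0.0001 * i,
     "SLT": 33 + 0.1 * i, "NH3": 35 + i}
    for i in range(9)
]
sample.append({**sample[4], "NH3": None})    # missing ammonia reading
sample.append({**sample[4], "NH3": 1000.0})  # implausible ammonia spike

cleaned = prepare(sample)
print(len(sample), "->", len(cleaned))
```

Since the deck mentions multiple iterations, a pass like `prepare` could be repeated until the record count stops changing; whether the original work did so is not stated.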
Total volume of data = 10,000

Data Modeling
• Correlation matrix: find the most important parameters impacting the NH3 in clean C.O. gas.
• ANN: predict generalized rules for NH3 < 40.

Correlation Matrix
       GT1    GT2    GT3    NH3    SLC    SLF    SLT    T
GT1    1.00   0.91   0.75   0.53   0.05  -0.14  -0.11   0.65
GT2    0.91   1.00   0.86   0.43   0.03  -0.14  -0.05   0.47
GT3    0.75   0.86   1.00   0.39   0.01  -0.14   0.28   0.34
NH3    0.53   0.43   0.39   1.00   0.08   0.04   0.08   0.43
SLC    0.05   0.03   0.01   0.08   1.00  -0.10  -0.13   0.00
SLF   -0.14  -0.14  -0.14   0.04  -0.10   1.00  -0.01  -0.05
SLT   -0.11  -0.05   0.28   0.08  -0.13  -0.01   1.00  -0.12
T      0.65   0.47   0.34   0.43   0.00  -0.05  -0.12   1.00
The correlation matrix shows that the gas scrubber temperatures have a direct impact on ammonia in C.O. gas (correlations with NH3 of 0.53, 0.43 and 0.39 for GT1, GT2 and GT3).

ANN Prediction
Out of the many rules generated by the algorithm, the rule that predicts the condition NH3 < 40 is selected.
[Figure: enlarged view of the selected rule; ranges legible in the figure: 33-33.5, 22-22.5, 0.005-0.006, 52-53]

Prediction of operating range at different conditions (NH3 <= 40)
"off" marks a parameter taken out of the model.
Condition     T        GT1        GT2        GT3        SLF        SLC          SLT      NH3
All in        22-22.5  30-30.5    29.5-30    29.5-30.5  52-55      0.005-0.006  33-33.5  <=40
T + GT1 off   off      off        28.5-29.5  29-30      54.0-55.0  0.005-0.006  33.5-34  10.0-40.0
T + GT2 off   off      30-30.5    off        29.5-30    52.0-53.0  0.005-0.006  33-33.5  10.0-40.0
T + GT3 off   off      30-30.5    29.5-30.0  off        52-54      0.005-0.006  33-33.5  10.0-40.0
GT1 off       22-22.5  off        29.5-30.0  29.5-30.0  52-54      0.005-0.006  33-33.5  10.0-40.0
GT2 off       22-22.5  30.0-30.5  off        29.5-30.5  52-54      0.005-0.007  33-33.5  10.0-40.0
GT3 off       22-22.5  30-30.5    29.5-30.0  off        52-54      0.005-0.006  33-33.5  10.0-40.0
T off         off      30-30.5    29.5-30.0  29.5-30.5  52-54      0.005-0.006  33-33.5  10.0-40.0

Confirmation of effects
Based on the data mining results, standard operating procedures (SOP) were revised, which further strengthened our daily management practices at the shop floor.
[Figure: two charts of Ammonia (mg/Nm³/hr); annotation: Major shutdown work at by-product plant]

Conclusions
• In addition to well-known areas like marketing & sales, fraud detection, etc., data mining can also be used in complex processes such as iron and steel making.
• Data mining is an effective technique for getting meaningful insights from large volumes of data in just a few seconds.
• Data mining can be a key differentiator in fast, evidence-based decision making.

End of presentation
Thank You!