PREPAID CHURN MODEL
With Oracle Data Mining

Necdet Deniz Halıcıoğlu
deniz.halicioglu@turkcellteknoloji.com.tr
September 21, 2010

Agenda
• About Turkcell Technology
• Churn Prediction
• Existing Mining System in Turkcell
• Data Mining with ODM
• SVM Model
• Conclusion

About Turkcell Technology
Turkcell Technology has more than 15 years of development experience, with its solutions applied and proven at leading operators in more than 10 countries.

Company timeline (slide axis: 1994 - 2006, 2007, 2008, 2009, Today):
• More than 10 years of experience in Turkcell ICT (1994 - 2006)
• TTECH was formed with 44 engineers in TÜBİTAK-MAM Technological Free Zone. Focus: Turkcell
• TTECH Center was put into service. HC: 255 engineers. Focus: Turkcell Group
• HC: 321 engineers. Focus: Turkcell & Telia Sonera Group
• HC: 360 engineers. Focus: Turkcell & Telia Sonera Group + Regional Sales

Areas of Competency
From assisting the operation of network resources to improving business-oriented intelligence, TTECH's experts provide an expanding portfolio of packaged and custom solutions for telecom network operators.
• Network Services & Enablers
• SIM Asset & Services Management
• Mobile Marketing
• Mobile Internet & Multimedia
• Business Intelligence & Support Systems

Turkcell Technology IMS Group
• More than 10 years of BI experience in the telecommunications industry
• Designed, built and runs one of the largest data warehouses in the telecom industry
• Team of more than 100 highly talented professionals and consultants
• Proven record of success in BI operations: flawless operation, providing data for finance and even for NYSE
• Early adopter of new BI trends: Complex Event Processing, Text Mining, etc.

What Makes Churn Prediction So Crucial?
Everybody faces the same difficulties…
• Competition
• Forming customer loyalty
• High cost of customer acquisition
• Optimizing the budget for customer retention
• People don't want to hear any more

Basics of Churn Prediction
Churn prediction starts with turning an abundance of data into valuable information and continues as a cyclic process.

Cycle stages: Data, Preparation, Preprocessing, Mining, Information, Action
• Preparation: define variable pool; perform mining ETL
• Preprocessing: attribute importance; normalization; outlier detection; missing value cleanup
• Mining: build; test; apply

Success Criteria
Outcome combinations shown on the slide: 0/0, 0/1, 1/0, 1/1
Risks named on the slide:
• Useless action
• Customer annoyance
• Customer loss

Pain Points About Existing Mining System
Components shown: E-DWH, DM-DWH, SAS Server, End Users
• Too much manual effort: a new project for every new mining activity
• SAS licensing
• Not leading, but lagging the business
• Administrative overhead of a distributed mining environment
• Network overhead
• Decoupled process monitoring
• Data quality problems

Approach in Existing Churn Model
• Attribute selection with human expertise
• Perform ETL
• Build many models in serial with different algorithms and hyperparameters
• Choose the best model manually
• Replace the existing model with the best model for churn prediction

Giving Oracle Data Mining a Try
Motivations
• Building an automated mining framework based on our Oracle database experience instead of maintaining a manual mining model cycle.
• No extra licensing cost (under ULA).
• High-speed (close to real time) mining with database-embedded mining.
• Centralized monitoring and administration of mining activity.
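The four success-criteria outcomes (0/0, 0/1, 1/0, 1/1) can be counted with a single query inside the database, in line with the motivation of keeping mining activity in Oracle. This is a minimal sketch rather than the presenters' code: the table churn_scored and its columns subscriber_id, actual_churn and predicted_churn are hypothetical names, and reading each pair as actual/predicted is an assumption, since the slide does not state the order.

```sql
-- Hypothetical table: churn_scored(subscriber_id, actual_churn, predicted_churn).
-- Assumption: both flags are 0 (subscriber stays) or 1 (subscriber churns).
-- Each result row is one outcome combination with its subscriber count.
SELECT actual_churn,
       predicted_churn,
       COUNT(*) AS subscribers
FROM   churn_scored
GROUP  BY actual_churn, predicted_churn
ORDER  BY actual_churn, predicted_churn;
```

Under that assumed reading, the 0/1 row counts subscribers who would receive a retention action without needing one, and the 1/0 row counts churners the model missed.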
Our Proposal for Data Mining Framework
Platform shown: Oracle
• Feed all possible customer attributes
• Let Attribute Importance (AI) filter the important ones
• Train Oracle SVM models with the selected attributes
• Externalize those models for APPLY

Choosing Attributes with Attribute Importance

    -- Perform EXPLAIN operation: rank each column by how well it explains the target
    BEGIN
      DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
        data_table_name     => 'census_dataset',
        explain_column_name => 'class',
        result_table_name   => 'census_explain_result');
    END;
    /

    -- View results
    SELECT * FROM census_explain_result;

Top 5 by AI

    COLUMN_NAME                     EXPLANATORY_VALUE  RANK
    ------------------------------  -----------------  ----
    IN_REF_NUMDAYSSINCELASTREFILL   .141200904         1
    DT_SUB_ACTIVATIONDATE           .028200303         2
    IN_MNP_PORTINTENURE             .026178093         3
    NM_SUB_ACTIVATIONREASON         .025882544         4
    IN_MNP_TCELL_TENURE             .025279836         5
    . . .

Our Top 5 After AI
• Number of days since last refill
• Activation date
• Port-in tenure
• Subscriber activation reason
• Subscription period in Turkcell

Build & Apply the SVM Model

    -- Make DBMS_OUTPUT text visible in SQL*Plus
    SET SERVEROUTPUT ON

    -- Perform PREDICT operation
    DECLARE
      v_accuracy NUMBER(10,9);
    BEGIN
      DBMS_PREDICTIVE_ANALYTICS.PREDICT(
        accuracy            => v_accuracy,
        data_table_name     => 'census_dataset',
        case_id_column_name => 'person_id',
        target_column_name  => 'class',
        result_table_name   => 'census_predict_result');
      DBMS_OUTPUT.PUT_LINE('Accuracy = ' || v_accuracy);
    END;
    /

    -- View first 10 predictions
    SELECT * FROM census_predict_result WHERE rownum <= 10;

    PERSON_ID  PREDICTION  PROBABILITY
    ---------  ----------  -----------
    2          1           .418787003
    7          0           .922977991
    8          0           .99869723
    9          0           .999999605
    10         0           .9999009

    5 rows selected.

Other Remarks on ODM
• No need to perform manual attribute processing in many cases
  EDP: Embedded data preparation
  ADP: Automatic data preparation
• PL/SQL or Java based code generation
• SAS to Oracle model import
• Eliminates data movement
• Eliminates data duplication
• Preserves security

Creating the Case Table
• Variable Pool (400 variables)
• Filter: PREPAID and INDIVIDUAL and (ACTIVE or MOC-BARRED)
• Filtered Variable Pool
• JOIN with the Historic Churn Table: MONTH(N)=MONTH(N+1)
• CASE TABLE

Building the SVM Model
• CASE TABLE: 400 attributes, unique identifier, target churn value
• ATTRIBUTE IMPORTANCE
• CASE TABLE (180 ATTRIBUTES)
• Datasets: FEB DATA with MAR CHURN; MAR DATA with APR CHURN; APR DATA with MAY CHURN
• COMBINE DIFFERENT DATASETS
• BUILD SVM MODEL
• MAY DATA with JUN CHURN

ODM on Oracle Exadata v2
• Initially we used a large Solaris box (100+ UltraSparc 7 cores and 640 GB memory) to build our first SVM models. It took 29 hours to complete model build & apply.
• On Exadata this is reduced to a few hours.
• This is mainly due to an enormous improvement in the data preparation stage.

To Sum Up
• Churn prediction over various customer groups is and will remain the focus of Turkcell.
• Embedded data mining with ODM is:
  Faster
  More robust (due to the stability of the SVM algorithm)
  Easier to automate
  Easier to manage

Thanks for his contribution:
Hüsnü Şensoy, VLDB Expert
husnu.sensoy@globalmaksimum.com
Data & Information Technologies

To learn more on SVM theory

Turkcell Technology Research and Development
TÜBİTAK MAM Teknoloji Serbest Bölgesi
Gebze – Kocaeli, TURKEY
Tel: +90 (262) 677 40 00
Fax: +90 (262) 677 40 01
Web: www.turkcelltech.com

THANK YOU!