Research paper

CALL NETWORK / CALL DETAIL RECORDS AS NEW BIG DATA SOURCE TO PREDICT CREDIT SCORING

TAGS: Credit scoring, Social network analysis, Profit measure, Mobile phone data, Big Data Scoring, Call Networks, Metadata

SERVICES: Research Planning | Data Collection | Semantic Annotation | Business Analytics | Bio Statistics | Econometrics

Copyright © 2019 Statswork. All rights reserved

Big Data Scoring is a cloud-based credit decision engine that helps banks, telecoms and consumer lenders improve credit quality and acceptance rates through the use of big data.

In Brief

The study demonstrates how including call networks as a new Big Data source, in the context of positive credit information, adds value in terms of profit, by applying a profit measure and profit-based feature selection.

Introduction

Credit scoring is one of the oldest applications of analytics: lenders and financial institutions perform statistical analysis to assess the creditworthiness of potential borrowers and to help them decide whether or not to grant credit. In 1956, Fair Isaac was founded as one of the first analytical companies offering retail credit scoring services in the US. Its well-known FICO score has been used as an analytical decision tool by financial institutions, insurers, utility companies and even employers.

Credit Scoring Models

Edward Altman developed a z-score model for bankruptcy prediction, which is still used to this day in Bloomberg reports as a default risk benchmark.
Initially, these models were built using limited data and were based on simple classification techniques such as linear programming, discriminant analysis and logistic regression. The importance of these retail and corporate credit scoring models increased further with regulatory compliance guidelines such as the Basel Accords and IFRS 9, which specify the inputs and outputs of a credit scoring model together with how these models can be used to compute provisions and capital buffers.

Even the most basic handset passively generates a vast amount of metadata, leaving behind a digital trace of its user’s activity. These metadata provide information on when, how, from where and with whom we communicate. At first, researchers explored the potential of such data by installing tracking software on the phones of volunteer subjects, as in the Reality Mining project at MIT. They later gained access to actual metadata directly from mobile network operators, leading to larger-scale research and greater analytical power. Several initiatives have emerged, such as the Data for Development (D4D) challenge organised by Orange, which provided datasets to the research community for development-related projects.

Call Detail Records as New Big Data Source to Predict Credit Scoring

In a recent survey carried out by the World Bank, mobile phone data ranked highest among the Big Data sources used in SDG-related projects. However, while new sources of data present the chance to profile potential borrowers using a more comprehensive representation of behaviour, they also pose an ethical challenge.
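As a rough illustration of how the "when, how, from where and with whom" metadata in call detail records could be turned into a call network, the following minimal sketch aggregates records into weighted links between subscribers and derives simple per-subscriber features. The record layout (CallRecord and its field names), the helper functions and the sample data are all hypothetical and chosen for illustration only; they are not the data format or the feature set used in the study.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    """One CDR row. All field names are illustrative assumptions."""
    caller: str        # pseudonymised subscriber ID ("with whom")
    callee: str        # pseudonymised subscriber ID ("with whom")
    start: datetime    # call start time ("when")
    duration_s: int    # call duration in seconds ("how")
    cell_id: str       # serving cell tower ("from where")


def build_call_network(records):
    """Aggregate records into undirected edges keyed by a sorted ID pair."""
    edges = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for r in records:
        if r.caller == r.callee:
            continue  # ignore self-calls
        key = tuple(sorted((r.caller, r.callee)))
        edges[key]["calls"] += 1
        edges[key]["seconds"] += r.duration_s
    return dict(edges)


def node_features(edges):
    """Per-subscriber degree, call count and total talk time."""
    feats = defaultdict(lambda: {"degree": 0, "calls": 0, "seconds": 0})
    for (a, b), weight in edges.items():
        for node in (a, b):
            feats[node]["degree"] += 1
            feats[node]["calls"] += weight["calls"]
            feats[node]["seconds"] += weight["seconds"]
    return dict(feats)


# Made-up sample records, for illustration only.
SAMPLE = [
    CallRecord("u1", "u2", datetime(2019, 1, 1, 9, 0), 60, "c10"),
    CallRecord("u2", "u1", datetime(2019, 1, 1, 18, 30), 120, "c11"),
    CallRecord("u1", "u3", datetime(2019, 1, 2, 12, 15), 30, "c10"),
]

network = build_call_network(SAMPLE)
features = node_features(network)
```

Features of this kind (number of contacts, call volume, talk time) are only the simplest possible network descriptors; a real scoring model would need far richer features and, as discussed in this deck, careful treatment of bias and privacy.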
Mobile phone data, e.g. in the form of call detail records (CDR), allow an extensive social network to be constructed, and using this information to profile repayment behaviour can be seen as unfair to borrowers, who could be penalised for their mobile phone behaviour.

More recently, interest in using call networks as a new Big Data source for credit scoring has gained momentum, e.g. with Wei et al. discussing the potential value of credit scores obtained with networks and how strategic tie-formation might affect these scores. Though especially interesting in light of the Chinese government’s plans for a social credit system, the study is purely theoretical and lacks a substantial empirical evaluation of the proposed models. Additionally, recent press coverage of specialised smartphone applications that assess people’s creditworthiness using the vast amount of data generated by their handsets indicates the potential of call networks as an alternative data source for credit scoring.

COMPARISON BETWEEN THE TRADITIONAL CDR ANALYSIS SOLUTIONS HIGHLIGHTING THE ADVANTAGES AND THE LIMITATIONS OF EACH

Statistical Limitations

CDRs are an excellent example of a Big Data source that can be repurposed from its primary use to estimate socio-economic variables and population mobility. Because they were not designed for this purpose, an inevitable bias will always affect any application based on these data.
If not correctly understood, this could lead to a serious misinterpretation of the results and, eventually, have damaging effects by misleading policy-makers.

1. Technical issues
2. Selection bias
3. Spatial bias

Data Privacy

To protect people’s privacy, phone data are always anonymised, i.e. all personal data such as name, address, etc. are either removed from the database or replaced by a randomly generated number to prevent identification. Data are then provided to a third party after a non-disclosure agreement (NDA) has been signed with the mobile network operator (MNO). The purpose of the agreement is to prevent CDRs from being shared with another party and to define the scope of the research questions that will be explored with the data. Both the anonymisation technique and the NDA are intended to safeguard users’ privacy.

Conclusion

Compared to traditional data collected to compute official statistics, CDRs are cost-effective and can deliver faster or even near real-time insights. They might also be used to test hypotheses and define future research questions. Credit-scoring agencies and creditors continually test and assess new credit-scoring models. The availability of “big data” could create opportunities for creditors who want to prospect for consumers, support new accounts, manage customers and grow profits. It is already clear that the mobile phone data used in this study is significant in terms of ‘Volume’, ‘Velocity’, ‘Veracity’ and ‘Variety’.
Analysis of the data and the resulting well-performing models shows that it also has a positive effect on financial inclusion and on model profit, and as such is also essential for ‘Value’: the fifth V of Big Data!

Contact Us

Get In Touch With Us: Freelancer | Consultant | Guest Blog Editor
Email Address: info@statswork.com
Phone Number: INDIA: +91-4448137070, UK: +44-1143520021
Email Address: hr@workfoster.com