Uploaded by Stats Work

Call Network Call detail records as new Big Data Source to predict Credit Scoring - Statswork

Research paper
CALL NETWORK / CALL
DETAIL RECORDS AS NEW
BIG DATA SOURCE TO
PREDICT CREDIT SCORING
Tags: Credit scoring, Social network analysis, Profit measure, Mobile phone data, Big Data Scoring, Call Networks, Metadata
Services: Research Planning | Data Collection | Semantic Annotation | Business Analytics | Bio Statistics | Econometrics
Copyright © 2019 Statswork. All rights reserved
Big Data Scoring is a cloud-based credit
decision engine that helps banks, telecoms and
consumer lenders improve credit quality and
acceptance rates through the use of big data.
In Brief
The study demonstrates that including call
networks, in the context of positive credit
information, as a new Big Data source adds
value in terms of profit, as shown by applying a
profit measure and profit-based feature selection.
Introduction
Credit scoring is one of the oldest applications of analytics, in which
lenders and financial institutions perform statistical analysis to
evaluate the creditworthiness of potential borrowers and to help them
decide whether or not to grant credit.
In 1956, Fair Isaac was founded as one of the first analytical
companies offering retail credit scoring services in the US.
Its well-known FICO score has been used as an analytical
decision instrument by financial institutions, insurers, utility
companies and even employers.
Credit Scoring Models
Edward Altman developed a z-score model for bankruptcy prediction, which
is still used to this day in Bloomberg reports as a default risk benchmark.
Initially, these models were built using limited data and were based on simple
classification techniques such as linear programming, discriminant analysis
and logistic regression.
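As an illustration of how simple such early models are, the sketch below computes a z-score as a weighted sum of five financial ratios. The slides do not give the formula; the coefficients and zone cut-offs used here are the commonly quoted ones for Altman's original model for publicly traded manufacturing firms, and the function and argument names are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Commonly quoted form of Altman's original z-score (assumed here)."""
    x1 = working_capital / total_assets          # liquidity
    x2 = retained_earnings / total_assets        # cumulative profitability
    x3 = ebit / total_assets                     # operating efficiency
    x4 = market_value_equity / total_liabilities # market-based leverage
    x5 = sales / total_assets                    # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def altman_zone(z):
    """Map a z-score to the usual three zones (cut-offs 1.81 and 2.99)."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Made-up balance sheet figures, for illustration only.
z = altman_z(working_capital=20, retained_earnings=30, ebit=10,
             market_value_equity=60, sales=100,
             total_assets=100, total_liabilities=50)
print(round(z, 2), altman_zone(z))
```

The model is a linear discriminant function: each ratio contributes independently, which is why it could be fitted with the limited data available at the time.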
The significance of these retail and corporate credit scoring models further
increased due to numerous regulatory compliance guidelines such as the
Basel Accords and IFRS 9 which specify the inputs and outputs of a credit
scoring model together with how these models can be used to compute
provisions and capital buffers.
Even the most basic handset passively generates a vast amount of metadata,
leaving behind a digital trace of its user's activity.
These metadata provide information on when, how, from where
and with whom we communicate. Initially, researchers
realized the potential of such data by installing tracking
software on volunteer subjects' phones through the Reality
Mining project at MIT.
They later gained access to real metadata directly from
mobile network operators, leading to larger-scale research and
greater statistical power. Several initiatives have emerged, such
as the Data For Development (D4D) challenge organized by
Orange, which provided datasets to the research community for
development-related projects.
Call Detail Records as New Big Data
Source to Predict Credit Scoring
In a recent survey carried out by the World Bank, mobile phone data ranked first among the
Big Data sources used in SDG-related projects.
New sources of data present the chance to profile potential borrowers using a more comprehensive
representation of behaviour, but they also pose an ethical challenge.
Mobile phone data, e.g., in the form of call detail records (CDR), allows an extensive social network to be constructed,
and using this information to profile repayment behavior can be seen as unfair to borrowers, who could be
penalized for their mobile phone behavior.
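To make the idea of constructing a social network from CDRs concrete, here is a minimal sketch using only the Python standard library. The record layout (caller, callee, duration in seconds) and the feature names are assumptions for illustration; real CDR schemas and the features used in the study are not described in these slides.

```python
from collections import defaultdict

# Hypothetical, already pseudonymised CDR rows: (caller, callee, seconds).
cdrs = [
    ("A", "B", 120),
    ("A", "C", 30),
    ("B", "A", 60),
    ("C", "D", 300),
]


def build_call_network(records):
    """Aggregate calls into undirected, weighted edges between subscribers."""
    edges = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for caller, callee, seconds in records:
        if caller == callee:
            continue  # ignore self-calls
        key = tuple(sorted((caller, callee)))
        edges[key]["calls"] += 1
        edges[key]["seconds"] += seconds
    return dict(edges)


def node_features(edges):
    """Per-subscriber features: number of contacts, calls and total seconds."""
    feats = defaultdict(lambda: {"degree": 0, "calls": 0, "seconds": 0})
    for (u, v), weight in edges.items():
        for node in (u, v):
            feats[node]["degree"] += 1
            feats[node]["calls"] += weight["calls"]
            feats[node]["seconds"] += weight["seconds"]
    return dict(feats)


network = build_call_network(cdrs)
features = node_features(network)
print(features["A"])
```

Features of this kind could then be fed into a scoring model alongside traditional variables, which is exactly where the fairness concern above arises: a borrower's score would depend partly on whom they call.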
More recently, interest in using call networks as a new Big
Data source for credit scoring has gained momentum, e.g., with
Wei et al. describing the potential value of credit scores obtained
from networks and how strategic tie formation might affect these
scores.
Though especially interesting in light of the Chinese
government's plans for a social credit system, the study is purely
theoretical and lacks a substantial empirical evaluation
of the proposed models.
Additionally, recent press coverage of specialized smartphone
applications that assess people's creditworthiness using the
vast amount of data generated by their handsets indicates the
potential of call networks as an alternative data source for credit
scoring.
COMPARISON BETWEEN THE
TRADITIONAL CDR ANALYSIS SOLUTIONS
HIGHLIGHTING THE ADVANTAGES AND THE
LIMITATIONS OF EACH
Statistical Limitations
CDRs are an excellent example of a Big Data source that can be diverted from its primary purpose to
approximate socio-economic variables and population mobility.
Because they are not designed for this purpose, an inevitable bias will always affect any
application based on these data.
If not correctly understood, this could lead to serious misinterpretation of the results and, eventually,
to damaging consequences by misleading policy-makers.
1. Technical issues
2. Selection bias
3. Spatial bias
Data Privacy
To protect people's privacy, phone data are systematically anonymised, i.e., all personal data such
as name, address, etc., are either removed from the database or replaced by a randomly generated
number to prevent identification.
Data are then provided to a third party after a non-disclosure agreement (NDA) has been signed with the
mobile network operator (MNO).
The purpose of the agreement is to prevent CDRs from being shared with another party and to define the
scope of the research questions that will be explored with the data.
Both the anonymization technique and the NDA are intended to preserve users'
privacy.
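The anonymisation step described above (dropping personal fields and replacing identifiers with randomly generated values) can be sketched as follows. This is only an illustration under assumed field names ("caller", "callee", "name", "address"); it is not the procedure of any particular operator, and replacing identifiers in this way does not by itself guarantee that individuals cannot be re-identified.

```python
import secrets


def pseudonymise(records,
                 id_fields=("caller", "callee"),
                 drop_fields=("name", "address")):
    """Drop personal fields and replace identifiers with random tokens.

    The same real identifier always maps to the same token, so the
    structure of the call network is preserved.
    """
    mapping = {}
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in drop_fields}
        for field in id_fields:
            if field in row:
                real_id = row[field]
                if real_id not in mapping:
                    mapping[real_id] = secrets.token_hex(8)  # random 64-bit token
                row[field] = mapping[real_id]
        cleaned.append(row)
    return cleaned, mapping


# Made-up records, for illustration only.
raw = [
    {"caller": "0470-111", "callee": "0470-222", "seconds": 42, "name": "Ann"},
    {"caller": "0470-222", "callee": "0470-111", "seconds": 10, "name": "Bob"},
]
anonymised, mapping = pseudonymise(raw)
print(anonymised)
```

In this sketch the mapping table is what must stay with the data owner; only the cleaned records would be handed to the third party.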
Conclusion
Compared to traditional data collected to compute official statistics, CDRs
are cost-effective and can deliver faster or even near real-time insights.
They might also be used to test hypotheses and define future research
questions. Credit-scoring agencies and creditors continually test and
assess new credit-scoring models.
The availability of "big data" could create opportunities for creditors
who want to prospect for consumers, support new accounts, manage
customers and grow profits.
It is already clear that the mobile phone data used in this study is
big in the sense of 'Volume', 'Velocity', 'Veracity' and 'Variety'.
Analysis of the data and the resulting well-performing models show that it
also has a positive effect on financial inclusion and on model profit, and
as such is also important for 'Value': the fifth V of Big Data!
Contact Us
Get In Touch With Us
Freelancer
Email Address: [email protected]
Consultant
Phone Number
INDIA: +91-4448137070
UK: +44-1143520021
Guest Blog Editor
Email Address: [email protected]