PAKDD 2006 Data Mining Competition

Singapore
Submitted by:
TUL3
BHARATH KONDURU
SAURBH PAWAR
SAYLEE SAKALIKAR
VIJAY GHUGE
DATE: Wednesday, March 1, 2006
Executive Summary
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially
useful information from data: the science of extracting useful information from large
data sets or databases. It is usually associated with a business's or other
organization's need to identify trends.
An Asian telco operator has successfully launched a third generation (3G) mobile
telecommunications network and, using its existing customer usage and demographic
data, would like to identify which customers are likely to switch to its 3G
network. The objective of this analysis is to accurately predict as many current 3G
customers as possible. SAS Enterprise Miner 4.3 was used to build the models and
perform the analysis.
For this task the group was given a training data set and a scoring data set. The
training data contained customers whose customer type, 2G or 3G, was known. The
first step was to examine the data. Based on correlation, intuition, technical facts,
and managerial judgment, we identified the variables likely to influence a customer's
transition from 2G to 3G. On this basis the group selected 62 of the 252 variables in
the data set, reducing redundancy.
An Insight node was added to examine missing values and distributions; it showed that
some variables needed replacement and transformation. Exhaustive scrutiny of the
Insight node output also showed that the data set was biased towards 2G customers, so
the team sampled the data to build an unbiased model. The sampled data was then
partitioned into training, validation, and testing sets with a split of 65%, 25%, and
10% respectively. Log transformations were applied to variables with skewed
distributions, variables without fair intervals of distribution were binned, unknown
values were replaced with the default constant 'None', and tree imputation replaced
the missing values of all other variables.
After making the above modifications to improve the model's predictive power, the team
built models with different combinations of algorithms. The models built for
prediction were as follows:
1. Logistic Regression
2. Decision Tree
3. Artificial Neural Network
4. Ensemble model
Of these models, the Decision Tree gave the best prediction of 3G customers. The best
model was selected on the basis of the sensitivity and misclassification rates
obtained for all three data sets. These figures for the Decision Tree model are shown
below:
Model          Training            Validation          Testing
               %Sens.    %Misc.    %Sens.    %Misc.    %Sens.    %Misc.
Decision Tree  79.52     21.57     79.09     21.47     77.85     24.33

Table 1: Sensitivity and misclassification rates of the Gini 3-Way tree (best model)
The Decision Tree model identified the following variables as the most important
predictors of current 3G customers: HS_AGE, AVG_NO_CALLED, AVG_VAS_GAMES, NUMBER OF
DAYS SINCE THE LAST RECEIVED RETENTION CAMP, TOP 2 INTERNATIONAL COUNTRY,
AVG_BILL_VOICED, AVG_VAS_SMS, AVG_MINS_MOB, AVG_MINS_IB, STD_NO_CALLED, and
LINE_TENURE.
The selected Decision Tree model was then applied to the scoring data set to obtain
the predicted serial numbers (IDs) of the 3G customers.
TABLE OF CONTENTS
1.0 Introduction
    1.1 Data Insight
    1.2 Sampling
    1.3 Data Partition
    1.4 Missing Value Replacement
    1.5 Variable Transformation
2.0 Model Development
    2.1 Logistic Regression
    2.2 Decision Tree
        2.21 Splitting Criterion
    2.3 Artificial Neural Network
    2.4 Ensemble Model
3.0 Scoring
4.0 Conclusion
5.0 References
6.0 Appendix
1.0 Introduction
The team analyzed the data set provided by the Asian telco operator using SAS
Enterprise Miner 4.3. Different prediction models were built to accurately identify
3G customers. A detailed report of the team's findings and of the models and
algorithms used for this prediction is provided below.
1.1 Data Insight
The data set provided to the team for analysis contained 24000 customer records with
nearly 250 variables, covering the top 3 countries called, customer class, the
different types of calls made, and services utilized such as games, tunes, SMS, and
GPRS. Many variables carrying redundant information were rejected manually; the
variables assigned the model role of input are listed in Appendix 2. The data set was
observed to be biased, had missing values, and some variables were not normally
distributed, so the variables were modified as follows.
1.2 Sampling
Of the 18000 customer records in the training data set, only 3000 belonged to 3G
customers, so the data was biased towards 2G and sampling was required for the model
to make unbiased predictions. The team used the Sampling node with the stratified
sampling method to draw a balanced sample of 6000 records containing equal numbers of
2G and 3G customers.
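The balanced sampling done by the Sampling node can be sketched in plain Python (a minimal illustration, not SAS code; the record layout and the reuse of the seed are assumptions made here):

```python
import random

def balanced_sample(records, target_key, n_per_class, seed=8108):
    """Stratified sample with equal allocation: draw the same number
    of records from each target class."""
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(rec[target_key], []).append(rec)
    sample = []
    for recs in by_class.values():
        sample.extend(rng.sample(recs, n_per_class))
    return sample

# Toy stand-in for the 18000-record training set (15000 2G, 3000 3G).
training = ([{"id": i, "type": "2G"} for i in range(15000)] +
            [{"id": 15000 + i, "type": "3G"} for i in range(3000)])
balanced = balanced_sample(training, "type", 3000)  # 6000 balanced records
```

Sampling 3000 records from each stratum yields the 6000-record balanced set used for modeling.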
1.3 Data Partition
The sampled data set was partitioned into training, validation, and testing sets with
a 65%, 25%, and 10% split respectively, using the simple random method with a seed of
8108. This resulted in 3900 training records, 1500 validation records, and 600
testing records.
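The partition step can be sketched as follows (an illustrative Python equivalent of the Data Partition node; the shuffle stream is not the same as SAS's, only the fractions and counts match):

```python
import random

def partition(records, fracs=(0.65, 0.25, 0.10), seed=8108):
    """Simple random partition into training/validation/testing sets."""
    rng = random.Random(seed)  # fixed seed, as in the Data Partition node
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fracs[0])
    n_valid = int(len(shuffled) * fracs[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

train, valid, test = partition(range(6000))  # 3900 / 1500 / 600 records
```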
1.4 Missing Value Replacement
As the data set contained many missing values, replacement was needed. The variables
TOP1_INT_CD, TOP2_INT_CD, TOP3_INT_CD, MARITAL_STATUS, and OCCUP_CD had unknown
values, which were replaced with the default constant 'None'. Tree imputation was
chosen for all other variables: using the default tab, imputed-indicator variables
were created for the missing values, and the missing values of class and interval
variables were replaced by the tree imputation method.
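The replacement logic can be sketched in Python (a simplified stand-in: the mean fill below substitutes for Enterprise Miner's tree imputation, and the record layout is hypothetical):

```python
from statistics import mean

CONSTANT_FILL = {"TOP1_INT_CD", "TOP2_INT_CD", "TOP3_INT_CD",
                 "MARITAL_STATUS", "OCCUP_CD"}

def replace_missing(records, numeric_vars):
    """Fill the listed code variables with the constant 'None'; fill
    numeric variables with the observed mean (a simple stand-in for
    tree imputation) and create an imputed-indicator flag."""
    for var in numeric_vars:
        fill = mean(r[var] for r in records if r[var] is not None)
        for r in records:
            r["M_" + var] = 1 if r[var] is None else 0  # indicator variable
            if r[var] is None:
                r[var] = fill
    for r in records:
        for var in CONSTANT_FILL:
            if var in r and r[var] is None:
                r[var] = "None"
    return records

recs = [{"HS_AGE": 10.0, "TOP1_INT_CD": None},
        {"HS_AGE": None, "TOP1_INT_CD": "25"}]
replace_missing(recs, ["HS_AGE"])
```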
1.5 Variable Transformation:
On examining the normality of the variable distributions, some variables were found
to be positively or negatively skewed or spiked at one end, and these were identified
for transformation. The transformations performed were log transformation and
binning. An example of the log transformation of AVG_NO_CALLED is shown below.
Figure 1: Log Transformation of AVG_NO_CALLED
Example of binning of the variable STD_MINS is shown below:
Figure 2: Binning of STD_MINS
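The two transformations can be sketched in a few lines (an illustration; the bin edges and labels below are hypothetical, not the cut points chosen in Enterprise Miner):

```python
import math

def log_transform(x):
    """log(x + 1): pulls in a right-skewed usage variable while
    keeping zero usage at zero."""
    return math.log(x + 1)

def bin_value(x, edges, labels):
    """Assign x to a bucket defined by ascending edge values."""
    for edge, label in zip(edges, labels):
        if x < edge:
            return label
    return labels[-1]

values = [0, 3, 12, 150]                 # e.g. AVG_NO_CALLED
logged = [log_transform(v) for v in values]
bucket = bin_value(40, edges=[10, 100], labels=["Low", "Medium", "High"])
```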
2.0 Model Development:
The data set was analyzed using the statistical models provided by SAS Enterprise
Miner. The models built by the team were (see Appendix 1):
- Logistic Regression
- Decision Tree
- Artificial Neural Network
- Ensemble Model
2.1 Logistic Regression:
The binary nature of the target variable (whether a customer is 2G or 3G) is the
driving force behind considering logistic regression. Logistic regression describes
the relationship between a categorical response variable, which can be binary,
ordinal, or nominal, and a set of predictor variables. It fits an 'S'-shaped curve
that better captures the binary nature of the target. The graph below shows the most
important variables on the basis of effect T-scores: HS_AGE,
DAYS_TO_CONTRACT_EXPIRY, STD_VAS_GAMES, NUMBER OF DAYS SINCE THE LAST RECEIVED
RETENTION CAMP, and AVG_VAS_SMS. [1][2][3]
Figure 3: Most important variables influencing predictivity
The % sensitivity and % misclassification rate of the model are shown in the model
comparison table below.
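The 'S'-shaped response can be seen in a tiny sketch (the intercept and coefficient below are hypothetical, chosen only to show the shape, not fitted values from this report):

```python
import math

def sigmoid(z):
    """The logistic (S-shaped) link used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted log-odds: P(3G) falls as handset age (HS_AGE) rises.
intercept, b_hs_age = 1.5, -0.8

def p_3g(hs_age):
    return sigmoid(intercept + b_hs_age * hs_age)
```

A newer handset (small HS_AGE) maps to a higher predicted probability of being a 3G customer, consistent with HS_AGE's high effect T-score.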
2.2 Decision Tree:
Decision trees partition large amounts of data into smaller segments by applying a
series of rules. Creating and evaluating decision trees benefits greatly from
visualization of the trees and diagnostic measures of their effectiveness. Decision
trees discover unexpected relationships, identify subdued relationships, can be used
for both categorical and continuous data, and accommodate missing data. They let us
visualize the tree at various levels of detail and examine diagnostic plots and
statistics. For appropriate assessment of the model, the following specific values
were set in the Tree node. [4][5]
2.21 Splitting Criteria:
All three purity measures (Chi-Square, Entropy Reduction, and Gini Reduction) were
tried, and the Gini 3-Way split was selected based on the following results:

Model           Training            Validation          Testing
                %Sens.    %Misc.    %Sens.    %Misc.    %Sens.    %Misc.
Chi-Sq 2 Way    80.91     22.38     80.05     23.40     77.33     24.83
Chi-Sq 3 Way    78.37     21.28     77.20     21.00     76.00     25.50
Entropy 2 Way   81.17     21.59     77.97     23.53     76.67     26.00
Entropy 3 Way   79.20     20.49     77.72     21.67     76.33     25.16
Gini 2 Way      78.99     20.69     78.23     21.47     76.33     26.16
Gini 3 Way      79.52     21.57     79.09     21.47     77.85     24.33
If the minimum number of observations in a leaf is too low, the tree overfits the
training data; if too high, it underfits and misses the relevant patterns in the
data. The team therefore chose 20 for this parameter. The observations required for
a split search parameter controls the depth of the tree and must be no less than
twice the minimum-leaf value, so it was set to 100. The maximum depth of the tree
was set to 6 for detailed analysis. The trade-off was made for consistency of
sensitivity and misclassification rate: although Chi-Square 2-Way and Entropy 2-Way
have better training sensitivity, they are not consistent in misclassification rate
and final output.
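The Gini purity measure behind the chosen split criterion reduces to a few lines (a generic sketch of Gini impurity and its reduction, not the Enterprise Miner internals):

```python
def gini(counts):
    """Gini impurity of a node with the given class counts: 1 - sum(p_k^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_reduction(parent, children):
    """Impurity drop when `parent` (class counts) is split into the
    child nodes listed in `children`."""
    n = sum(parent)
    weighted = sum(sum(ch) / n * gini(ch) for ch in children)
    return gini(parent) - weighted

# A balanced 2G/3G node split into two much purer leaves.
drop = gini_reduction([50, 50], [[45, 5], [5, 45]])
```

The split search picks the candidate split with the largest impurity drop; a 3-way search simply allows three child lists instead of two.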
Important Variables:
1. HS_AGE
2. AVG_MINS_MOB
3. LST_RETENTION_CAMP
4. TOP1_INT_CD
5. TOP2_INT_CD
6. AVG_VAS_GAMES
7. AVG_NO_CALLED
With the above options and important variables, the rules obtained from this decision
tree that yield the majority of 3G responses are given below (splits are on the
transformed scales used by the tree; the games-utilization bins are quoted as their
bin labels):
1. If handset age in months < 0.8958
2. If handset age in months < 2.1383 and number of days since the last received
retention camp < 0.135
3. If handset age in months < 2.1383 and number of days since the last received
retention camp < 0.095 and average number of mobile calls in last 6 months >= 5.5490
4. If handset age in months < 2.1383 and number of days since the last received
retention camp < 0.095 and average number of mobile calls in last 6 months < 5.5490
and average games utilization (KB) in last 6 months = '2.48E6-High'
5. If handset age in months < 2.1383 and number of days since the last received
retention camp < 0.095 and average number of mobile calls in last 6 months < 5.5490
and average games utilization (KB) in last 6 months = 'Low-472078' and Top 1
international country = 29
6. If handset age in months >= 2.1383 and average number of mobile calls in last 6
months > 4.6793 and Top 2 international country = 258
7. If handset age in months >= 2.1383 and average number of mobile calls in last 6
months > 4.6793 and Top 2 international country = None and Top 1 international
country = 25
8. If handset age in months >= 2.1383 and average number of mobile calls in last 6
months > 4.6793 and Top 2 international country = None and Top 1 international
country = None and average number of different numbers called in last 6 months
>= 3.686
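The first two leaf rules translate directly into guard clauses (an illustrative sketch; the thresholds are the transformed values quoted above):

```python
def predicts_3g(hs_age, days_since_retention_camp):
    """Rules 1 and 2 of the tree, applied on the transformed scales."""
    if hs_age < 0.8958:                                        # rule 1
        return True
    if hs_age < 2.1383 and days_since_retention_camp < 0.135:  # rule 2
        return True
    return False
```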
The number of leaves chosen by SAS for pruning was 18.
Figure 4: Number of Leaves Pruned
The % sensitivity and % misclassification rate of the model are shown in the model
comparison table below.
2.3 Artificial Neural Network:
An artificial neural network is a network of many simple processors ("units"), each
possibly having a small amount of local memory. The units are connected by
communication channels ("connections") that usually carry numeric (as opposed to
symbolic) data encoded by various means. The units operate only on their local data and
on the inputs they receive via the connections. The restriction to local operations is often
relaxed during training. [5][6][7][8]
Artificial neural networks are a class of flexible nonlinear regression,
discriminant, and data-reduction models interconnected in a nonlinear dynamic
system. They are useful tools for interrogating increasing volumes of data and for
learning from examples to find patterns. The large number of customer records
(24000) and variables (252) in the given data set implied complex nonlinear
relationships, so neural networks were built to make accurate predictions for this
data mining problem.
The team built artificial neural network models with three different architectures,
namely:
- Multilayer Perceptron (MLP)
- Ordinary Radial Basis Function (RBF) with equal widths
- Ordinary Radial Basis Function (RBF) with unequal widths
All three architectures were tried with different options in the Neural Network node
of SAS Enterprise Miner. Based on the % sensitivity and % misclassification rates
for the three models, the team picked the neural network with the 'Ordinary RBF with
equal widths' architecture.
Model          Training            Validation          Testing
               %Sens.    %Misc.    %Sens.    %Misc.    %Sens.    %Misc.
ANN MLP        71.27     23.15     68.91     24.33     67.33     26.86
ANN RBF Eq     71.02     23.00     66.23     24.94     67.79     26.00
ANN RBF UnEq   70.23     24.41     69.17     24.60     68.00     26.66

Table 2: Sensitivity and misclassification rates of the various ANN models
For this model, the selection criterion used was misclassification rate, the number
of hidden neurons selected was 3, and randomization was preferred for scale
estimates, target weights, and target bias weights.
The model showed the following plot with the important variables and the associated
weights corresponding to each hidden neuron.
Figure 5: Important variables and the associated weights corresponding to each
hidden neuron
The % sensitivity and % misclassification rate of the selected model are shown in
Table 2 above and in the model comparison table below.
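The selected 'Ordinary RBF with equal widths' architecture can be sketched as follows (the centers, weights, width, and bias are illustrative values, not the trained network):

```python
import math

def rbf_unit(x, center, width):
    """Ordinary radial basis unit: activation decays with squared
    distance from the center; 'equal widths' means every hidden unit
    shares the same `width` value."""
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist2 / (2 * width ** 2))

def rbf_score(x, centers, width, weights, bias):
    """Output of a 3-hidden-unit RBF network, before the logistic
    link used for the binary 2G/3G target."""
    return bias + sum(w * rbf_unit(x, c, width)
                      for w, c in zip(weights, centers))

centers = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]   # illustrative centers
score = rbf_score([1.0, 1.0], centers, width=1.0,
                  weights=[0.5, 1.5, -0.4], bias=0.1)
```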
2.4 Ensemble Model
The Ensemble node creates a new model by averaging the posterior probabilities (for
class targets) or the predicted values (for interval targets) from multiple models;
the component models are thus integrated into a potentially stronger solution, and
the new model is then used to score new data. The component models integrated here
were the Logistic Regression, the Decision Tree with the Gini 3-Way purity measure,
and the Artificial Neural Network with the RBF equal-widths architecture. [1]
The % sensitivity and % misclassification rate of the ensemble model are shown in
the model comparison table below.
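The averaging performed by the Ensemble node amounts to one line per customer (the posterior probabilities below are illustrative, not values from the report):

```python
def ensemble_posterior(model_probs):
    """Average the posterior P(3G) from the component models,
    as the Ensemble node does for a class target."""
    return [sum(ps) / len(ps) for ps in zip(*model_probs)]

# Illustrative posteriors for three customers from the three components.
regression = [0.20, 0.70, 0.55]
tree       = [0.10, 0.80, 0.60]
rbf_net    = [0.30, 0.60, 0.50]

avg = ensemble_posterior([regression, tree, rbf_net])
labels = ["3G" if p >= 0.5 else "2G" for p in avg]
```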
Comparison of all models built:

Model           Training            Validation          Testing
                %Sens.    %Misc.    %Sens.    %Misc.    %Sens.    %Misc.
LR              77.94     19.92     72.60     23.53     72.48     25.00
Decision Tree   79.52     21.57     79.09     21.47     77.85     24.33
ANN             71.02     23.00     66.23     24.94     67.79     26.00
Ensemble Model  79.25     49.00     78.93     51.00     73.20     51.00

Table 3: Comparison of sensitivity and misclassification rates of all models
From the comparison table it can be seen that the % sensitivity and %
misclassification rate of the Decision Tree model are better than those of all the
other models, and its consistency across training, validation, and testing is also
the best. Moreover, only 20.83% of 3G customers were wrongly predicted as 2G by the
Decision Tree model, as can be seen from the output below: 625 of the 3000 actual 3G
customers were misclassified, fewer than with any other model.
The FREQ Procedure
Table of CUSTOMER_TYPE by I_CUSTOMER_TYPE (frequency and row percent)

Actual \ Predicted        2G          3G       Total
2G                     12546        2454      15000
                      83.64%      16.36%
3G                       625        2375       3000
                      20.83%      79.17%
Total                  13171        4829      18000
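The rates quoted above follow directly from the counts in this table (a quick check in Python):

```python
# Counts from the FREQ table above (rows = actual, columns = predicted).
tn, fp = 12546, 2454   # actual 2G predicted as 2G / as 3G
fn, tp = 625, 2375     # actual 3G predicted as 2G / as 3G

sensitivity = tp / (tp + fn)        # share of actual 3G found: 79.17%
missed_3g_rate = fn / (tp + fn)     # 3G wrongly predicted as 2G: 20.83%
overall_error = (fp + fn) / (tn + fp + fn + tp)
```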
Lift Chart for all models:
On the lift chart it can be observed that all models except the Ensemble model
perform almost identically across the deciles. So, selecting the final model
required a trade-off between the lift chart results and the % sensitivity and %
misclassification rates seen in the comparison table.
Eventually, Decision Tree was selected as the final model for scoring as it showed better
% sensitivity and % misclassification rate than the other models.
3.0 Scoring:
Scoring generates and manages predicted values from the trained model using its
prediction and assessment formulas. For this problem, scoring was applied to the new
"holdout" data set provided, to predict whether each record is a 2G or 3G customer.
To score with the selected decision tree model, a Score node was attached to the
Tree node, an Input Data Source node containing the "holdout" sample was attached to
the Score node, and the Score node was set to 'Apply training data score code to
score data set' before the path was run. An Insight node was then attached to the
Score node and run; the results were saved in the selected library and opened with
'Analyst' in SAS to obtain the following table of predicted 2G and 3G customers:
The FREQ Procedure
Table of CUSTOMER_TYPE by I_CUSTOMER_TYPE (frequency and row percent)

Actual \ Predicted        2G          3G       Total
2G                     12546        2454      15000
                      83.64%      16.36%
3G                       625        2375       3000
                      20.83%      79.17%
Total                  13171        4829      18000
The file opened in SAS Analyst was then saved as a delimited file, keeping only two
fields, SERIAL_NUMBER (the ID field) and I_CUSTOMER_TYPE (the prediction of the
target field 'Customer Type' (2G/3G) for the holdout sample), and deleting all other
fields.
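The same two-field export can be sketched in Python (the serial numbers below are hypothetical placeholders, not records from the data set):

```python
import csv
import io

def write_predictions(scored, out):
    """Keep only the ID and the predicted customer type, as in the
    delimited file saved from SAS Analyst."""
    writer = csv.writer(out)
    writer.writerow(["SERIAL_NUMBER", "I_CUSTOMER_TYPE"])
    for rec in scored:
        writer.writerow([rec["SERIAL_NUMBER"], rec["I_CUSTOMER_TYPE"]])

scored = [{"SERIAL_NUMBER": "A001", "I_CUSTOMER_TYPE": "3G", "HS_AGE": 0.4},
          {"SERIAL_NUMBER": "A002", "I_CUSTOMER_TYPE": "2G", "HS_AGE": 3.1}]
buf = io.StringIO()
write_predictions(scored, buf)
```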
4.0 Conclusion:
Based on the above findings, we conclude that the Decision Tree with the Gini 3-Way
purity measure is the best model, with consistent results. Though a few trade-offs
had to be made with respect to sensitivity and misclassification, the number of
predicted 3G customers, 1423 current or prospective, was optimal compared to the
other models.
This SAS Singapore Data Mining Competition gave us a feel for real-world data and an
opportunity to work with it. The experience has helped bridge the gap between theory
and practice.
Financial Interpretations:
Assumptions:
Cost of sending coupons per person = $1
Total number of customers = 6000
Revenue per product purchased by the customer = $10
Cost of mailing the 6000 customers = 6000* 1 = $ 6000
When coupons were sent to total number of customers and we know that only 1000
respond from them then profit derived from it is
Profit = 1000*$10 - 6000 = $4000 Thus, we land up with a profit of $4000.
But by using the Decision Tree model which predicts 1423 potential 3G customers and
assuming that model is 100% accurate in predicting the actual 1000 customers, then
mailing cost to 1423 customers = $1423.
Profit = 1000*$10 - $1423 = $8577
Thus, we land up with a profit of $8577.
This is for a dataset of 6000 customers. For a dataset with a million customer records the
profits will be humongous.
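The arithmetic above can be packaged as a small helper (same assumptions as stated: $1 per mailing, $10 revenue per responder, 1000 actual responders):

```python
def campaign_profit(n_mailed, n_responders,
                    cost_per_mail=1.0, revenue_per_response=10.0):
    """Net profit of a coupon mailing: response revenue minus mailing cost."""
    return n_responders * revenue_per_response - n_mailed * cost_per_mail

mass_mailing = campaign_profit(6000, 1000)   # mail every customer
targeted = campaign_profit(1423, 1000)       # mail only predicted 3G customers
```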
5.0 References:
1. www.ats.ucla.edu/stat/sas/topics/logistic_regression.htm
2. www.indiana.edu/~statmath/stat/all/cat/1b1.htm
3. luna.cas.usf.edu/~mbrannic/files/regression/Logistic.html
4. www.sas.com/technologies/analytics/datamining/miner/dec_trees.html
5. www.sas.com/offices/asiapacific/sp/training/courses/dmdt.html
6. citeseer.ist.psu.edu/36580.htm
7. www.sas.com/technologies/analytics/datamining/miner/neuralnet.html
8. www.sas.com/offices/asiapacific/sp/training/courses/dmnn.html
9. dimacs.rutgers.edu/Workshops/AdverseEvent/slides/stultz.ppt
6.0 Appendix: 1
Appendix: 2 Variables selected manually
Name                      Model Role  Measurement  Type  Format   Informat
SERIAL_NUMBER             id          nominal      char  $4.00    $4.00
AGE                       input       interval     num   BEST12.  12
AVG_BILL_AMT              input       interval     num   BEST12.  12
AVG_BILL_VOICED           input       interval     num   BEST12.  12
AVG_CALL                  input       interval     num   BEST12.  12
AVG_M2M_CALL_RATIO        input       interval     num   BEST12.  12
AVG_MINS                  input       interval     num   BEST12.  12
AVG_MINS_IB               input       interval     num   BEST12.  12
AVG_MINS_MOB              input       interval     num   BEST12.  12
AVG_NO_CALLED             input       interval     num   BEST12.  12
AVG_OP_MINS_RATIO         input       interval     num   BEST12.  12
AVG_PK_CALL_RATIO         input       interval     num   BEST12.  12
AVG_T1_CALL_CON           input       interval     num   BEST12.  12
AVG_VAS_GAMES             input       interval     num   BEST12.  12
AVG_VAS_QG                input       interval     num   BEST12.  12
AVG_VAS_QP                input       interval     num   BEST12.  12
AVG_VAS_QTUNE             input       interval     num   BEST12.  12
AVG_VAS_SMS               input       interval     num   BEST12.  12
AVG_VAS_XP                input       interval     num   BEST12.  12
BLACK_LIST_FLAG           input       binary       num   BEST12.  12
COBRAND_CARD_FLAG         input       binary       num   BEST12.  12
CUSTOMER_CLASS            input       ordinal      num   BEST12.  12
DAYS_TO_CONTRACT_EXPIRY   input       interval     num   BEST12.  12
GENDER                    input       binary       char  $1.00    $1.00
HIGHEND_PROGRAM_FLAG      input       binary       num   BEST12.  12
HS_AGE                    input       interval     num   BEST12.  12
HS_MODEL                  input       nominal      num   BEST12.  12
LINE_TENURE               input       interval     num   BEST12.  12
LOYALTY_POINTS_USAGE      input       interval     num   BEST12.  12
LST_RETENTION_CAMP        input       interval     num   BEST12.  12
LUCKY_NO_FLAG             input       binary       num   BEST12.  12
MARITAL_STATUS            input       binary       char  $1.00    $1.00
NATIONALITY               input       nominal      num   BEST12.  12
NUM_TEL                   input       interval     num   BEST12.  12
OCCUP_CD                  input       nominal      char  $4.00    $4.00
PAY_METD                  input       nominal      char  $2.00    $2.00
STD_CALL                  input       interval     num   BEST12.  12
STD_MINS                  input       interval     num   BEST12.  12
STD_MINS_INTT1            input       interval     num   BEST12.  12
STD_MINS_INTT2            input       interval     num   BEST12.  12
STD_MINS_INTT3            input       interval     num   BEST12.  12
STD_NO_CALLED             input       interval     num   BEST12.  12
STD_OP_CALL_RATIO         input       interval     num   BEST12.  12
STD_T1_MINS_CON           input       interval     num   BEST12.  12
STD_VAS_AR                input       interval     num   BEST12.  12
STD_VAS_GAMES             input       interval     num   BEST12.  12
STD_VAS_SMS               input       interval     num   BEST12.  12
SUBPLAN                   input       interval     num   BEST12.  12
TOP1_INT_CD               input       nominal      char  $4.00    $4.00
TOP2_INT_CD               input       nominal      char  $4.00    $4.00
TOP3_INT_CD               input       nominal      char  $4.00    $4.00
TOT_DELINQ_DAYS           input       interval     num   BEST12.  12
TOT_DIS_1900              input       interval     num   BEST12.  12
TOT_PAST_REVPAY           input       ordinal      num   BEST12.  12
TOT_PAST_TOS              input       interval     num   BEST12.  12
VAS_AR_FLAG               input       binary       num   BEST12.  12
VAS_CND_FLAG              input       binary       num   BEST12.  12
VAS_GPRS_FLAG             input       binary       num   BEST12.  12
VAS_IB_FLAG               input       binary       num   BEST12.  12
VAS_IEM_FLAG              input       binary       num   BEST12.  12
VAS_VM_FLAG               input       binary       num   BEST12.  12

(The variable label is identical to the variable name in every case.)