Prescriptive Analysis Additional derived variables introduced after the past two weeks of analysis Compare best two models using ROC plot Final choice of deployment The final choice of model for deployment I chose was the linear regression model and the optimized KPI That we decided to focus on were 0.3 , 0.5 (moderator) and 0.6 regarding the confusion matrix of each Summary of the lr model loan screening glm(formula = default ~ credit.policy + int.rate + fico + log.annual.inc + delinq.2yrs + revol.util + purpose + inq.last.6mths, family = "binomial", data = train) purposesmall_business 0.251542 0.244660 1.028 0.30389 inq.last.6mths 0.083533 0.044026 1.897 0.05778 . --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Coefficients: Estimate Std. Error z value Pr(>|z|) Null deviance: 2958.2 on 2133 degrees of freedom (Intercept) 0.294343 307.410265 0.001 0.99924 Residual deviance: 1987.1 on 2120 degrees of freedom credit.policy -18.388313 307.402963 -0.060 0.95230 AIC: 2015.1 int.rate 58.023942 3.906555 14.853 < 2e-16 *** fico 0.021969 0.002511 8.750 < 2e-16 *** Number of Fisher Scoring iterations: 17 log.annual.inc -0.461259 0.100242 -4.601 4.2e-06 *** delinq.2yrs -0.265352 0.122426 -2.167 0.03020 * Graphical representantion of defaulting probability by the lr model 6 Recap the different models and describe any new variables or methods tried and lessons learnt The logistic regression (LR) model stands out for its effectiveness, supported by statistically significant coefficients, particularly in key features like interest rate (int.rate), FICO score, and revolving utilization (revol.util) with substantial impact, indicated by significant coefficients and low p-values. The model's robust fit to the data is underscored by deviance metrics, notably the substantially lower residual deviance compared to the null deviance. In contrast, the random forest (RF) model's assessment relies on MeanDecreaseGini values for variable importance. For a more direct comparison, considering performance metrics such as accuracy, precision, recall, and F1-score, along with the examination of confusion matrices on the test set, would provide insights into the models' performances in terms of true positives, true negatives, false positives, and false negatives. The decision to prefer the LR model over the RF model is rationalized by specific numerical comparisons, emphasizing the LR model's strengths in interpretability and computational efficiency. Visuals or descriptions to best compare your two most effective models (the tree model has been included and it has a discrete curve) Optimized Threshold 9 Optimized Threshold 10 Describing additional insights 11 Describing additional insights 12