1. My project is how to build a model for predicting whether customers will be interested in the vehicle insurance by using machine learning. 2. Here is the content of this presentation 3. The background of my task is that an insurance company want to know if their current customers will be interested in the vehicle insurance 4. Therefore we can refine this task into a more specific problem, which variable will have a significant effect on the customers interest, through calculating the correlation, we aim to determine the effects of a binary factor, vehicle damage or not. The null hypothesis for this study is simply that Vehicle damage and vehicle not damage has the same performance on the response. the alternative hypothesis is that Vehicle damage has a better performance on the response than vehicle not damage. Besides that, some relationship between independent variables will be mentioned in the result. 5. Before modelling we need to do the following preprocessing work for the training data. 6. Basically, we train logistic regression model and classification model, such as decision tree, random forest for this binary problem, Set the parameters manually first then apply grid search to select parameters to better fit the model. 7. For evaluating the model ,we use 5% as the significance level to test the reliability and classification f1 score, accuracy score, auc score to measure the effectiveness, McNemar's test for comparing different models. 8. However, when it comes to similar accuracy score when compare two different performance model, for this vehicle insurance problem, we focus more on whether customers are interested in or not, therefore the higher precision score the better. For this reason the final model is logistic regression model. Because there is no significant difference between logistic regression and random forest, the feature importance of random forest can be used for testing the null hypothesis, Vehicle with damage is regarded as a much more important factor for customers to purchase car insurance, rather than vehicle with no damage. 9. As a result, combine the solution with the inner correlation of variables from spearman correlation, the selling strategy should focus more on the group of elder people whose cars were bought within one year but had damaged already and looking for the new insurers, choose traditional agents as the main communication method. Above all is my presentation, thank you.