Ensemble Methods
Predictive Analytics: Week 5
Yicheng Song
Assistant Professor
Carlson School of Management
ycsong@umn.edu

Outline
• Ensemble Methods
  – Bagging
  – Boosting
  – Stacking

Predictive Analytics (MSBA 6420)

Why Ensemble Methods?

Ensemble Example

Ensemble Methods

Diversity in Predictive Models

Where Error Comes From

Theoretical Insight into Ensemble Performance

Various Techniques & Bias-Variance Trade-off

Bagging
• N training examples
• Sample N’ examples with replacement (usually N’ = N): Set 1, Set 2, Set 3, Set 4
• Function 1, Function 2, Function 3, Function 4

Bagging
• Testing data x → Function 1, Function 2, Function 3, Function 4 → y1, y2, y3, y4 → Average/voting
• This approach helps when the model is complex and prone to overfitting.

General Approach 1: Bagging

Cons of Decision Tree
• Unstable
  – Small change in the input data
  – Large effect on the structure of the tree
• Reason
  – Hierarchical nature of the tree-growing process

Solution: Random Forest
post

Performance Comparison on UCI Glass Dataset
[Figure: error rate vs. number of base learners; series: Random forest]

General Approach 2: Boosting

AdaBoost Example
https://towardsdatascience.com/boosting-algorithm-adaboost-b6737a9ee60c

Boosting Workflow
$\alpha_t = \ln \sqrt{(1 - \varepsilon_t) / \varepsilon_t}$

Gradient Boosting
• We first model the data with simple models and analyze the errors.
• These errors mark data points that are difficult for a simple model to fit.
• Later models then focus on those hard-to-fit points to get them right.
• In the end, we combine all the predictors, giving each predictor a weight.
https://towardsdatascience.com/boosting-algorithm-gbm-97737c63daa3

Gradient Boosting Playground
• http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html

Comparison
https://towardsdatascience.com/https-medium-com-vishalmorde-xgboostalgorithm-long-she-may-rein-edd9f99be63d

eXtreme Gradient Boosting
https://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/
https://towardsdatascience.com/boosting-algorithm-xgboost-4d9ec0207d
https://stats.stackexchange.com/questions/282459/xgboost-vs-python-sklearn-gradient-boosted-trees

XGBoost Examples
https://github.com/dmlc/xgboost/tree/master/demo

XGBoost and Beyond
https://arxiv.org/pdf/1809.04559.pdf
https://www.kaggle.com/nschneider/gbm-vs-xgboost-vs-lightgbm

Stacking

Voting
[Diagram: x → Model 1, Model 2, Model 3, Model 4 → y (each) → Majority Vote]

[Diagram: Training Data 1, Training Data 2; Testing Data x → Model 1, Model 2, Model 3, Model 4 → y (each) as new feature → Final Classifier]

General Approach 3: Stacking

Overfitting?
• Does ensemble (boosting) cause overfitting?
  – It overfits some parts of the data
  – but will therefore underfit other parts of the data.
  – The overfitting at that point (or those points) is averaged out with the underfitting.
• stackexchange post1 post2 post3

Multi-Class Classification Using Binary Classifiers

Multi-Class Classification with ECOC

Why Can XGBoost Be Parallelized?
Method 1: Parallelize Node Building at Each Level
Method 2: Parallelize Split Finding on Each Node
http://zhanpengfang.github.io/418home.html
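
Method 2 can be illustrated with a minimal sketch. It is not XGBoost's implementation; it assumes a plain squared-error criterion in place of XGBoost's gradient-based gain, and the function names (best_split_for_feature, find_best_split) and the toy data are hypothetical. The point it tries to show is that the scan over one feature does not depend on the scan over any other feature, so the per-feature searches on a node can be handed to separate workers and only the final "pick the best" step is sequential.

```python
from concurrent.futures import ThreadPoolExecutor


def best_split_for_feature(rows, targets, feature):
    """Scan one feature and return (sse, feature, threshold) for its best split.

    Assumption: squared error is the split criterion (a stand-in for the
    gain that a real gradient boosting library would compute).
    """
    order = sorted(range(len(rows)), key=lambda i: rows[i][feature])
    n = len(targets)
    total_sum = sum(targets)
    total_sq = sum(t * t for t in targets)
    best = (float("inf"), feature, None)
    left_sum = 0.0
    left_sq = 0.0
    for k in range(n - 1):
        i, j = order[k], order[k + 1]
        left_sum += targets[i]
        left_sq += targets[i] ** 2
        if rows[i][feature] == rows[j][feature]:
            continue  # no threshold fits between two equal values
        n_left, n_right = k + 1, n - k - 1
        right_sum = total_sum - left_sum
        right_sq = total_sq - left_sq
        # Sum of squared errors around the mean on each side of the split.
        sse = (left_sq - left_sum ** 2 / n_left) + (right_sq - right_sum ** 2 / n_right)
        if sse < best[0]:
            threshold = (rows[i][feature] + rows[j][feature]) / 2
            best = (sse, feature, threshold)
    return best


def find_best_split(rows, targets, max_workers=4):
    """Search every feature in its own task, then keep the lowest-error split."""
    n_features = len(rows[0])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        candidates = list(
            pool.map(lambda f: best_split_for_feature(rows, targets, f),
                     range(n_features))
        )
    return min(candidates, key=lambda c: c[0])


# Hypothetical toy node: feature 0 separates the targets cleanly, feature 1 does not.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
targets = [0.0, 0.0, 1.0, 1.0]
sse, feature, threshold = find_best_split(rows, targets)
print(feature, threshold, sse)
```

Threads are used here only to show the structure of the computation. In CPython they will not speed up pure-Python arithmetic, so a real speed-up would need processes or native code.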