Regression Using Boosting
Vishakh (vv2131@columbia.edu)
Advanced Machine Learning
Fall 2006

Introduction
● Classification with boosting
    – Well-studied
    – Theoretical bounds and guarantees
    – Empirically tested
● Regression with boosting
    – Rarely used
    – Some bounds and guarantees
    – Very little empirical testing

Project Description
● Study existing algorithms & formalisms
    – AdaBoost.R (Freund & Schapire, 1997)
    – SquareLev.R (Duffy & Helmbold, 2002)
    – SquareLev.C (Duffy & Helmbold, 2002)
    – ExpLev (Duffy & Helmbold, 2002)
● Verify effectiveness by testing on an interesting dataset.
    – Football Manager 2006

A Few Notes
● Want PAC-like guarantees.
● Can't directly transfer processes from classification.
    – Simply re-weighting the distribution over iterations doesn't work.
    – Can modify samples and still remain consistent with the original function class.
● Performing gradient descent on a potential function.

SquareLev.R
● Squared error regression.
● Uses a regression algorithm as the base learner.
● Modifies labels, not the distribution.
● Potential function uses the variance of the residuals.
● New label is proportional to the negative gradient of the potential function.
● Each iteration, mean squared error decreases by a multiplicative factor.
● Can get arbitrarily small squared error as long as the correlation between residuals and predictions > threshold.

SquareLev.C
● Squared error regression.
● Uses a base classifier.
● Modifies labels and distribution.
● Potential function uses residuals.
● New label is the sign of the instance's residual.

ExpLev
● Attempts to get small residuals at each point.
● Uses an exponential potential.
● AdaBoost pushes all instances to a positive margin.
● ExpLev pushes all instances to have small residuals.
● Uses a base regressor ([-1,+1]) or classifier ({-1,+1}).
● Two-sided potential uses exponents of residuals.
● Base learner must perform well with relabeled instances.

Naive Approach
● Directly translate AdaBoost to the regression setting.
● Use thresholding of squared error to reweight.
● Use as a baseline to test the veracity of the other approaches.

Dataset
● Data from Football Manager 2006
    – Very popular game
    – Statistically driven
● Features are player attributes.
● Labels are average performance ratings over a season.
● Predict performance levels and use the learned model to guide game strategy.

Work so far
● Conducted survey.
● Studied methods and formal guarantees and bounds.
● Implementation still underway.

Conclusions
● Interesting approaches and analyses of boosting regression available.
● Insufficient real-world verification.
● Further work
    – Regressing noisy data
    – Formal results for more relaxed assumptions
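
Sketch: label-modifying leveraging in the style of SquareLev.R

The SquareLev.R slide describes relabeling each instance with the negative gradient of a variance-of-residuals potential and fitting a base regressor to those labels. The following is a minimal sketch of that idea, not the algorithm as published by Duffy & Helmbold. Several pieces are assumptions made for illustration: the 1-D regression stump `fit_stump` stands in for the base learner, the step size is chosen by a least-squares line search, and the toy data (y = x^2) and all function names are hypothetical.

```python
from statistics import fmean


def fit_stump(xs, targets):
    """Hypothetical base regressor: a 1-D stump fit by least squares.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, targets) if x <= t]
        right = [y for x, y in zip(xs, targets) if x > t]
        if not right:  # the largest threshold leaves nothing on the right
            continue
        lm, rm = fmean(left), fmean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def leverage(xs, ys, rounds=20):
    """Gradient descent on the variance of the residuals (assumed potential)."""
    ensemble = []              # (step size, base hypothesis) pairs
    preds = [0.0] * len(xs)    # current master predictions on the sample
    history = []               # potential at the start of each round
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        mean_r = fmean(residuals)
        # New labels: centered residuals, proportional to the negative
        # gradient of the potential with respect to the predictions.
        labels = [r - mean_r for r in residuals]
        potential = fmean(c * c for c in labels)
        history.append(potential)
        if potential < 1e-12:
            break
        f = fit_stump(xs, labels)
        out = [f(x) for x in xs]
        mean_f = fmean(out)
        out_c = [o - mean_f for o in out]
        denom = sum(o * o for o in out_c)
        if denom == 0:  # base hypothesis is constant: no progress possible
            break
        # Line search: the step that minimizes the new residual variance.
        alpha = sum(c * o for c, o in zip(labels, out_c)) / denom
        ensemble.append((alpha, f))
        preds = [p + alpha * o for p, o in zip(preds, out)]
    # A constant offset removes the remaining mean residual.
    offset = fmean(y - p for y, p in zip(ys, preds))

    def predict(x):
        return offset + sum(a * f(x) for a, f in ensemble)

    return predict, history


xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]
predict, history = leverage(xs, ys, rounds=15)
print("potential per round:", [round(v, 4) for v in history])
```

With this step size, the residual variance after a round equals the previous variance times (1 - c^2), where c is the sample correlation between the new labels and the base hypothesis's outputs. That is one way to read the slide's "decreases by a multiplicative factor" claim: progress continues as long as the correlation stays bounded away from zero.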