Ridge Regression using PROC REG
A Fixed Effect Model for Determining the Mixture of Acquisition/Subscription Cost
Steven Matthew Anderson
Century Link
Anderson.Research.co.llc@gmail.com

Outline
• A Case Study to Introduce Ridge Regression
  – Description of the Business Problem
  – Regression Model
  – Problems with the Model
• Ridge Regression Model
  – Description of the Method
  – How Does it Work
• SAS’s PROC REG
  – Code
  – Output
• Simulation of the Model
• Summary
• Future work

A Case Study to Introduce Ridge Regression
• Terminology
  – Fixed Cost
  – Variable Cost
  – Acquisition Expense
  – Subscription Expense
  – Mixtures of Acquisition and Subscription Expense
  – Side Note: Some Examples of Analysis Using this Cost Structure
• The Business Problem
• The Regression Model
• Problems with the Model

Fixed Cost
• Fixed costs are business expenses that do not change in proportion to the activity of the business (within a relevant time period)
• Discretionary fixed costs
  – Arise from annual decisions by management to spend on certain fixed cost items
• Committed fixed costs
  – Costs that do not change significantly over time
• Examples
  – Staff Salaries
  – Network Management
  – Data/IP Strategy
  – Sales Force Management
  – Most Overhead expense
[Figure: Fixed Cost vs Time. Axes: Expense against Time; annotation: Adjustment]

Variable Cost
• Variable costs are expenses that change in proportion to the activities of the business.
• Semi-variable costs are fixed costs that are adjusted periodically to accommodate changes in business activity.
  – Looks like a step function over time
• Semi-variable costs are considered in this study to be variable costs.
• Examples
  – Costs of goods sold
  – Commissions
  – Sales Headcount (minus commissions)
  – Call Center Staffing
  – Bad Debt
[Figure: Variable Cost vs Time. Axes: Expense against Time; annotation: Adjustment]
Acquisition Expense
• Can be interpreted as expenses incurred to “Make the Sale.”
• Positively correlated with acquisition activities
  – # Sales units (Gross Inwards)
  – # Call Center employees
• Examples
  – Marketing incentives
  – Sales Headcount
  – Installation of Service
  – Design Services (WAN)
[Figure: Acquisition Cost vs Sales Units. Axes: Expense against Sales Units (AGI)]

Subscription Expense
• Can be interpreted as expenses incurred to “Keep the Customer.”
• Positively correlated with Monthly Subscription Activity
  – Monthly Revenue
  – # of Revenue Generating Units (RGU)
• Examples
  – Repair of services
  – Collections
  – Network Monitoring
[Figure: Subscription Cost vs Revenue. Axes: Expense against Revenue]

Mixed Acquisition/Subscription Expense
• Expenses that are positively correlated with both Subscription and Acquisition Activity
  – Fleet
  – Construction
  – Hosting Operations

Financial Analysis Examples using this Cost Structure
• Break Even Analysis
  – Used to analyze the potential profitability of an expenditure in a sales-based business
  – Need to find the breakeven point (the point where revenue is equal to expense)

$$\text{BEP} = \frac{\text{Fixed Cost}}{\text{Selling Price} - \text{Variable Cost}}$$

[Figure: picture stolen from Wikipedia]

Financial Analysis Examples using this Cost Structure
• Customer Lifetime Value
  – Used in Marketing to determine how much each customer is “worth” over time
  – R = Revenue
  – E = Expense
  – Calculated by:

$$\mathrm{CLV}_k = \sum_{t=0}^{T} \frac{R_t^k - E_t^k}{(1+i_t)^t} = \left(R_0^k - E_0^k\right) + \sum_{t=1}^{T} \frac{R_t^k - E_t^k}{(1+i_t)^t} = \text{AcquisitionMargin}_k + \sum_{t=1}^{T} \frac{\text{SubscriptionMargin}_k}{(1+i_t)^t}$$

Description of the Business Problem
• Given a particular cost pool (i.e. bucket)
  – What percentage of the cost pool can be classified as fixed or variable cost?
  – What percentage of the cost pool can be classified as acquisition or subscription cost?
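The break-even and customer lifetime value calculations from the financial analysis examples can be sketched in a DATA step. This is only an illustration under assumed inputs: every number below (costs, prices, revenues, expenses, and the constant discount rate `i`) is hypothetical and does not come from the case study, and the variable names are invented for the example.

```sas
/* Break-even point: BEP = Fixed Cost / (Selling Price - Variable Cost). */
/* All inputs are hypothetical values chosen for illustration.           */
data _null_;
  fixed_cost    = 10000;   /* assumed fixed cost for the period  */
  selling_price = 25;      /* assumed selling price per unit     */
  variable_cost = 15;      /* assumed variable cost per unit     */
  bep_units = fixed_cost / (selling_price - variable_cost);
  put bep_units=;          /* writes the break-even volume to the log */
run;

/* Customer lifetime value for one customer over T = 3 periods.     */
/* Period 0 carries the acquisition margin; periods 1..T carry the  */
/* discounted subscription margins.                                 */
data _null_;
  array R[0:3] _temporary_ (0 40 40 40);    /* assumed revenue, t = 0..3 */
  array E[0:3] _temporary_ (60 10 10 10);   /* assumed expense, t = 0..3 */
  i = 0.01;                                 /* assumed constant discount rate */
  acquisition_margin = R[0] - E[0];
  clv = acquisition_margin;
  do t = 1 to 3;
    clv = clv + (R[t] - E[t]) / (1 + i)**t; /* discounted subscription margin */
  end;
  put acquisition_margin= clv=;
run;
```

With these made-up inputs the acquisition margin is negative (it costs money to make the sale) and the discounted subscription margins recover it over time, which is the pattern the cost structure is meant to expose.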
Regression Model

$$\text{Expense} = \beta_0 + \beta_1 A + \beta_2 S + \beta_3 (AS)$$

• Expense = Total expense in cost pool
• A = Acquisition Activity (AGI)
• S = Subscription Activity (RGU)
• (AS) = Cross Product Interaction Term

[Figures: panels labeled "100% Subscription Expense" and "100% Acquisition Expense"; axis label: Subscription Activity]

Regression Model: Answering the Fixed/Variable Expense Question
Let A = Acquisition Expense and S = Subscription Expense, so that

$$\text{Total Expense} = \beta_0 + \beta_1 A + \beta_2 S + \beta_3 (AS)$$

$$\beta_0 = \text{Average Fixed Expense}$$

$$\text{Variable Expense} = \text{Total Expense} - \beta_0$$

$$\text{Percentage of Variable Expense} = \frac{\text{Variable Expense}}{\text{Total Expense}}$$

$$\text{Percentage of Fixed Expense} = \frac{\text{Fixed Expense}}{\text{Total Expense}} = \frac{\beta_0}{\text{Total Expense}} = 1 - \frac{\text{Variable Expense}}{\text{Total Expense}}$$

Regression Model: Answering the Acquisition/Subscription Question
[Derivation slide: the intermediate steps, written in terms of A, S, and E (Total Expense), are not recoverable from the source text. The slide closes with the split below.]

$$\underbrace{\frac{\beta_1^2}{\beta_1^2 + \beta_2^2}}_{\text{Percentage of Acquisition Cost}} + \underbrace{\frac{\beta_2^2}{\beta_1^2 + \beta_2^2}}_{\text{Percentage of Subscription Cost}} = 1$$

The Results from My Brilliant Model
• Variance Inflation Factors are HUGE!
• None of the parameter estimates are significant
• When parameter estimates were significant:
  – The confidence intervals around them made the results useless!
  – The signs were often wrong with respect to reality

The Problem: Reading the Log
• Extreme Cases
  – SAS Note: Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
  – SAS Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
    interaction = -105.877 * Intercept + 13.0209 * ln_agi + 8.13133 * ln_rgu

An Example

    ods graphics on;
    /* OLS fit with tolerance, VIF, and collinearity diagnostics */
    proc reg data=sim_data outvif outest=bob;
      model total_expense = A S Interaction / tol vif collin;
    run;
    /* Print the parameter estimates saved by OUTEST= */
    proc print data=bob;
    run;
    ods graphics off;

Analysis of Variance

    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model              3         43231154      14410385     74.77   <.0001
    Error             46          8865802        192735
    Corrected Total   49         52096956

Parameter Estimates

    Variable      DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
    Intercept      1                14672            20592      0.71     0.4798           .                    0
    A              1             -4.55289          8.23521     -0.55     0.5830     0.00192            521.23743
    S              1             -2.08466          4.09754     -0.51     0.6134     0.00330            302.85512
    interaction    1              0.00176          0.00164      1.07     0.2898     0.00128            784.02140

Collinearity Diagnostics

                                                 Proportion of Variation
    Number   Eigenvalue   Condition Index   Intercept     A             S             interaction
    1        3.99240            1.00000     5.692765E-7   5.758341E-7   5.725649E-7   5.794308E-7
    2        0.00482           28.76909     0.00039475    0.00055150    0.00055094    0.00040402
    3        0.00277           37.95309     0.00094557    0.00070160    0.00068843    0.00097339
    4        0.00000230      1318.75978     0.99866       0.99875       0.99876       0.99862

So What Happened?

$$(XB)^T (Y - XB^*) = 0$$

$$B^T X^T (Y - XB^*) = 0 \qquad \text{since } (AB)^T = B^T A^T$$

$$B^T (X^T Y - X^T X B^*) = 0 \qquad \text{by distribution}$$

$$(X^T Y - X^T X B^*) \cdot B = 0 \qquad \text{since } A^T B = B \cdot A$$

$$(X^T X) B = X^T Y$$

If $X^T X$ is invertible, then B has a unique solution B = B*:

$$B^* = (X^T X)^{-1} X^T Y$$

Basically, for $X^T X$ to be invertible each column must be a pivot column. If the design matrix X has one or more variables that are linear combinations of the other variables, then when you row reduce $X^T X$ you are going to get at least one row that has a bunch of zeros in it, and at least one of your columns isn’t going to be a pivot column. Ergo, you do not have a unique solution!

Near multicollinearity means that at least one column is approximately a linear combination of some or all of the others, making $X^T X$ nearly singular.

(Enter stage left) Ridge Regression
• Modify Least Squares Regression to allow biased estimators of the regression coefficients.
• Bias versus precision trade off
[Figure: bias versus precision; labels: E(b), E(bR), Bias of bR]

$$B = (X^T X)^{-1} X^T Y$$

is modified to move $X^T X$ away from near singularity and closer to the state of orthogonality among the columns:

$$B_R = (X^T X + k I_{m+1})^{-1} X^T Y$$

where k ≥ 0 is known as the biasing or shrinkage parameter. We introduce bias by uniformly increasing the diagonal elements of $X^T X$ while leaving the off-diagonal elements unchanged.

Methods for Picking a Likely Value of k
• Graphically, using the Ridge Trace Graph
  – A plot of the parameters against k, estimating where the coefficients become “stable”
• Getting the VIFs as close to 1 as possible
• Staring at the errors and figuring out where the RMSE levels off
• Using the formula by Hoerl, Kennard, and Baldwin

$$k = \frac{(m+1)\, S^2}{\beta_{OLS}^T \beta_{OLS}}$$

Simulation
• 50 observations
• Intercept = N(1000, 50)
• Acquisition → N(2500, 50)
• Subscription = 0.7*Acquisition
• Interaction = acquisition*subscription

$$\frac{\beta_1^2}{\beta_1^2 + \beta_2^2} + \frac{\beta_2^2}{\beta_1^2 + \beta_2^2} = 1$$

So “in theory” we should end up with 57% Acquisition and 43% Subscription.

$$k = \frac{(m+1)\, S^2}{\beta_{OLS}^T \beta_{OLS}} = \frac{(4)(57651)}{18943761.8} \approx 0.01217$$

SAS’s PROC REG

    ods graphics on;
    /* Ridge fits for k = 0 to 0.03 in steps of 0.001; OUTEST= saves the */
    /* estimates and OUTVIF adds the ridge VIFs to that data set         */
    proc reg data=sim_data outvif outest=rb ridge=0 to 0.03 by .001;
      title 'Ridge Regression with PROC REG';
      model total_expense = A S Interaction / tol vif collin;
    run;
    ods graphics off;

[Figures: SAS Ridge Plots; SAS Diagnostics; SAS Diagnostics II]

SAS Output Dataset

    Type of      Ridge regression   Root mean                                                    difference
    statistics   control value      squared error   Intercept   A        S         interaction   in rmse
    PARMS                           240.1072        4352.4418   1.4511   -3.1776   1.28E-03
    RIDGE        0                  240.1072        4352.4418   1.4511   -3.1776   1.28E-03
    RIDGE        0.001              240.4279        2518.0393   1.8645   -1.7268   8.74E-04      13.3446
    RIDGE        0.009              242.0831         616.1069   1.6862    0.5524   4.71E-04       4.6013
    RIDGE        0.01               242.1817         565.9577   1.6599    0.6410   4.61E-04       4.0718
    RIDGE        0.011              242.2697         524.0733   1.6362    0.7175   4.52E-04       3.6324
    RIDGE        0.012              242.3488         488.6401   1.6147    0.7842   4.45E-04       3.2640
    RIDGE        0.013              242.4203         458.3412   1.5953    0.8428   4.38E-04       2.9523
    RIDGE        0.014              242.4855         432.1970   1.5776    0.8948   4.33E-04       2.6867
    RIDGE        0.015              242.5451         409.4631   1.5615    0.9412   4.28E-04       2.4585
    RIDGE        0.028              243.0417         268.0123   1.4331    1.2765   3.94E-04       1.1248
    RIDGE        0.029              243.0680         263.2177   1.4269    1.2911   3.92E-04       1.0824
    RIDGE        0.03               243.0934         258.8830   1.4211    1.3048   3.91E-04       1.0441

SAS Output Dataset

    Ridge regression   Type of
    control value      statistics   A          S          interaction
    0                  RIDGEVIF     244.8223   228.4689   530.7665
    0.001              RIDGEVIF     113.7915   110.8910   164.5080
    0.009              RIDGEVIF      14.5425    14.9128     8.0670
    0.01               RIDGEVIF      12.5768    12.9119     6.7163
    0.011              RIDGEVIF      10.9903    11.2939     5.6825
    0.012              RIDGEVIF       9.6907     9.9662     4.8737
    0.013              RIDGEVIF       8.6122     8.8629     4.2289
    0.014              RIDGEVIF       7.7071     7.9359     3.7067
    0.015              RIDGEVIF       6.9398     7.1492     3.2779
    0.028              RIDGEVIF       2.5530     2.6368     1.0876
    0.029              RIDGEVIF       2.4088     2.4880     1.0239
    0.03               RIDGEVIF       2.2770     2.3519     0.9663

Simulation Results
Model: (57% Acquisition, 43% Subscription)
    Expense = 1,000 + (Acquisition) + (Subscription) + (Interaction)
OLS: (184.1% Subscription, -84.1% Acquisition)
    Expense = 4352.442 + 1.4511(Acquisition) - 3.1776(Subscription) + (1.28E-03)(Interaction)
SAS Ridge, k = 0.012: (67.3% Acquisition, 32.7% Subscription)
    Expense = 488.64 + 1.61(Acquisition) + 0.784(Subscription) + (4.45E-04)(Interaction)

Summary
• Ridge Regression corrects for multicollinearity problems by modifying the method of least squares to allow biased estimators that are more precise.
• Allows me to perform Customer Lifetime Value and Breakeven Analysis with existing correlated regressors
• Not perfect, but better than OLS estimation
• SAS needs some additional functionality
  – Confidence intervals for the $B_i$’s
  – Confidence intervals for k

Next Steps
• Implementing other methodology for choosing the shrinkage parameter
  – Dorugade and Kashid (2009)
  – Mardikyan and Cetin (2008)
  – Lawless and Wang (kLW) (1976)
• Add to SAS: Confidence Intervals
  – Firinguetti & Bobadilla’s Asymptotic Confidence Intervals
  – Crivelli, Firinguetti & Montano’s Bootstrapping Confidence Intervals
  – Feig’s Monte Carlo method for Evaluating Confidence Intervals
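
As a cross-check on the numbers reported above, the Hoerl, Kennard, and Baldwin value of k and the cost shares can be recomputed from the SAS Output Dataset table. The sketch below makes two assumptions that the slides do not state explicitly: that the intercept is included in the coefficient vector when forming the sum of squared OLS coefficients, and that each share is a coefficient divided by the sum of the A and S coefficients. The second rule is an inference, chosen because it approximately reproduces the percentages quoted in Simulation Results; all variable names are invented for the example.

```sas
/* Inputs copied from the SAS Output Dataset table:            */
/* row RIDGE 0 (the OLS fit) and row RIDGE 0.012.               */
data _null_;
  rmse = 240.1072;                       /* root mean squared error at k = 0 */
  b0 = 4352.4418; b1 = 1.4511; b2 = -3.1776; b3 = 1.28E-03;
  m_plus_1 = 4;                          /* intercept plus three regressors */

  /* Hoerl-Kennard-Baldwin: k = (m+1) * S^2 / (beta_OLS' beta_OLS).  */
  /* Assumption: the intercept is part of beta_OLS.                  */
  s2    = rmse**2;
  btb   = b0**2 + b1**2 + b2**2 + b3**2;
  k_hkb = m_plus_1 * s2 / btb;
  put k_hkb= 8.5;

  /* Assumed share rule: coefficient / (A coefficient + S coefficient), */
  /* expressed in percent.                                              */
  ols_share_a = 100 * b1 / (b1 + b2);
  ols_share_s = 100 * b2 / (b1 + b2);
  r1 = 1.6147; r2 = 0.7842;              /* A and S estimates at k = 0.012 */
  ridge_share_a = 100 * r1 / (r1 + r2);
  ridge_share_s = 100 * r2 / (r1 + r2);
  put ols_share_a= 8.1 ols_share_s= 8.1;
  put ridge_share_a= 8.1 ridge_share_s= 8.1;
run;
```

Under these assumptions k comes out close to 0.01217, the OLS shares land near -84% and 184%, and the ridge shares near 67.3% and 32.7%. Small differences from the quoted figures are expected because the table coefficients are rounded.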