Litmus: Robust Assessment of Changes in Cellular Networks
Ajay Mahimkar, Zihui Ge, Jennifer Yates, Chris Hristov*, Vincent Cordaro*, Shane Smith*, Jing Xu*, Mark Stockert*
AT&T Labs – Research
* AT&T Mobility Services
ACM CoNEXT 2013, Santa Barbara, CA

© 2013 AT&T Intellectual Property. All rights reserved. AT&T, the AT&T logo and all other AT&T marks contained herein are trademarks of AT&T Intellectual Property and/or AT&T affiliated companies. All other marks contained herein are the property of their respective owners.

Cellular network changes
- Network changes and assessment
  - Software upgrades, configuration changes, …
  - How does it impact user perception of service quality?
    - Voice and data connection attempts (Accessibility)
    - Successful termination of ongoing calls (Retainability)
    - Data throughput, Voice Erlangs, …
- Extensive testing in labs before deployment in the field
  - However, no lab can fully replicate scale, complexity and diversity of large-scale operational networks
- First Field Application (FFA)
  - Before rolling out the change network-wide, conduct small scale testing in operational network
[Diagram labels: Cell Tower, Radio Network Controller, Circuit switched core (Voice), Packet switched core (Data)]

Impact Assessment of FFA
- Analyze the performance impact of FFA
  - Possible performance impacts: Improvement, No change, Degradation
- FFA pre/post impact analysis of service performance
  - Compare service performance after FFA with that of before
- If FFA is successfully trialed and shows expected performance impacts, then it can be rolled out network-wide
  - Go/no-go decision for a wide-scale roll-out is crucial
- Challenges: external factors can make assessment difficult

Dependency on external factors
- Service performance in cellular networks is influenced by several external factors
  - Weather (heavy rainfall introduces obstruction for radio signals)
  - Terrain (mountains, flat surfaces and tall buildings have different propagation properties)
  - User population densities and mobility patterns
  - Seasonal changes (foliage or leaves budding)
  - Traffic pattern changes (holidays, major events or trade shows)
  - Other network events (outages or maintenance activities in other parts of network)
- Example: a configuration change accidentally co-occurs with strong winds that negatively impacted service performance
  - Result: unnecessary roll-back of change without knowledge of impact of strong winds

Dependency on external factors
- Yearly seasonality in Voice Retainability for UMTS cell towers due to foliage
  - Assessment of changes would be difficult because of seasonal changes
- Degradations in Voice Accessibility across multiple RNCs due to severe storms and damaging hail during a tornado
  - Assessment of changes at RNCs would be difficult because of weather impact
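The strong-winds example above can be made concrete with a small sketch. It is only an illustration: the function name `pre_post_verdict`, the tolerance and every number below are hypothetical, and the metric is assumed to be a retainability-style percentage where higher is better. A plain pre/post comparison on the changed tower reports a degradation, yet an unchanged nearby tower shows the same drop, which points to the weather rather than the change.

```python
from statistics import median

def pre_post_verdict(before, after, tolerance=0.5):
    """Naive pre/post analysis: compare medians of one element's metric."""
    shift = median(after) - median(before)
    if shift > tolerance:
        return "improvement"
    if shift < -tolerance:
        return "degradation"
    return "no change"

# Hypothetical daily voice retainability (%) around a configuration change
# that happens to coincide with strong winds.
changed_tower_before = [98.1, 98.0, 98.2, 97.9, 98.1]
changed_tower_after = [96.9, 97.0, 97.2, 96.8, 97.1]

# A nearby tower that did NOT receive the change sees the same wind.
unchanged_tower_before = [97.5, 97.6, 97.4, 97.5, 97.7]
unchanged_tower_after = [96.4, 96.5, 96.3, 96.6, 96.4]

naive = pre_post_verdict(changed_tower_before, changed_tower_after)
neighbor = pre_post_verdict(unchanged_tower_before, unchanged_tower_after)
print(naive, neighbor)
```

Both verdicts come out as a degradation, so looking at the changed tower alone would suggest a roll-back that the change itself does not warrant.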
Dependency on external factors
- Dramatic traffic pattern change during holidays induces significant changes in Voice Retainability
  - Assessment of changes would be difficult because of traffic pattern changes
- Improvements in Voice Retainability across a majority of cell towers due to software upgrade at an upstream RNC
  - Assessment of changes at cell towers would be difficult because of upstream RNC changes
- Pre/post impact analysis of FFA changes needs to account for the overshadowing effects of external factors

Litmus idea: study/control comparison
- Compare performance between study and control group
  - Study group: network elements where change is implemented
  - Control group: network elements without the change
- Intuition
  - Performance at geographically nearby elements is correlated
  - External factor influences performance at both study and control
  - A performance impacting change at study will change the dependency between study and control
- Challenges
  - Unrelated performance changes in a small number of control group members
  - Poor selection of control group
- Litmus Solution
  - Robust spatial regression algorithm
  - Domain knowledge guided control group selection
[Diagram labels: User Equipments, Cell Tower (Study with FFA, Control), Radio Network Controller (RNC), Circuit switched core (Voice), Packet switched core (Data)]

Litmus comparison to related work
- Study-group only analysis
  - Mercury [SIGCOMM’10], PRISM [CoNEXT’11], Spectroscope [NSDI’11], …
  - Does not account for impact of unrelated external factors
- A/B testing, also known as split testing or control/treatment
  - Popular in web domains for data driven decision making [KDD’07,’12]
  - Web users randomly exposed to the two variants of experiment
  - Why doesn’t it apply in our context?
    - Tight coupling between experiment and assessment
    - Control group might be subject to other network events such as changes or unplanned outages
- Difference in Differences (DiD)
  - Compare mean/median difference between study and control before and after the change
  - Why doesn’t it apply in our context?
    - Contamination of forecast due to poor selection of control group
    - Sensitivity to performance changes in a small number of control group members
[Figure labels: Control (A), Study (B), Feedback; serving users at lake; serving users at business location]

Robust spatial regression
- Multiple iterations of forecast difference comparison increase the robustness to a few bad members in the control group
[Diagram: in both the before-change and after-change periods, a regression f (regression coefficients) over a sampled control group produces a forecast of the study group; the delta between study and forecast study feeds robust rank-order tests]
- Output: Degradation / Improvement / No change

Domain knowledge guided control group selection
- Guidelines for control group selection
  - Subject to same external factors as the study group
  - Share similar properties with study group such as geographical proximity or configuration
- Control group size
  - Not too large: difficult to capture similar impact due to external factor
  - Not too small: lose benefits of robustness in spatial regression analysis
- Attributes for selection
  - Geographical distance using latitude/longitude and zip-code
  - Topological structure of the cellular network
  - Configuration settings such as software version, or equipment model
- Predicates to select control group
  - Uni-variate: single attribute (e.g., LTE cell towers within the same zip-code)
  - Multi-variate: combination of attributes (e.g., UMTS cell towers with same RNC and same OS)
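The uni-variate and multi-variate predicates above can be sketched as small composable functions. This is an illustration under stated assumptions, not the tool's actual interface: the `Element` record, its field names, the helper names `same`, `all_of` and `select_controls`, and the tower inventory are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass(frozen=True)
class Element:
    """Hypothetical inventory record for one cell tower."""
    name: str
    technology: str
    zip_code: str
    rnc: str
    software: str

# A predicate decides whether a candidate is comparable to the study element.
Predicate = Callable[[Element, Element], bool]

def same(attribute: str) -> Predicate:
    """Uni-variate predicate: candidate matches the study on one attribute."""
    return lambda study, cand: getattr(study, attribute) == getattr(cand, attribute)

def all_of(*predicates: Predicate) -> Predicate:
    """Multi-variate predicate: every given predicate must hold."""
    return lambda study, cand: all(p(study, cand) for p in predicates)

def select_controls(study: Element, inventory: Iterable[Element],
                    predicate: Predicate, changed=frozenset()) -> List[str]:
    """Pick control elements: matching, not the study itself, not changed."""
    return [e.name for e in inventory
            if e.name != study.name and e.name not in changed
            and predicate(study, e)]

inventory = [
    Element("T1", "UMTS", "93101", "RNC-A", "v2"),
    Element("T2", "UMTS", "93101", "RNC-A", "v2"),
    Element("T3", "UMTS", "93101", "RNC-B", "v2"),
    Element("T4", "UMTS", "93117", "RNC-A", "v1"),
    Element("T5", "UMTS", "93117", "RNC-A", "v2"),
]
study = inventory[0]

by_zip = select_controls(study, inventory, same("zip_code"))
by_rnc_and_sw = select_controls(study, inventory,
                                all_of(same("rnc"), same("software")))
print(by_zip, by_rnc_and_sw)
```

The `changed` argument reflects the definition of a control group as elements without the change: any element that also received the change is excluded even when it matches the predicate.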
Litmus evaluation
- Evaluation conducted using data collected from operational cellular networks
  - Lack of complete ground truth makes evaluation extremely challenging
- Two-step methodology
  - A-priori known changes and assessment by Engineering & Ops (real examples)
    - Previously conducted manually through visual inspection & analysis
  - Synthetic injection of changes in performance time-series at cell towers (thorough and exhaustive evaluation)
- Compare Litmus with Difference in Differences (DiD) and study-group only analysis
- Accuracy computation

Expectation \ Algorithm outcome | Improvement    | Degradation    | No impact
Improvement                     | True positive  | False negative | False negative
Degradation                     | False negative | True positive  | False negative
No impact                       | False positive | False positive | True negative

- Result summary
  - Litmus outperformed study-group only analysis because of robustness to external factors
  - Litmus outperformed DiD because of robustness to a small number of bad members in control group
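The accuracy computation can be illustrated with a short sketch that maps each (expectation, algorithm outcome) pair to a cell of the table above and then applies the standard definitions precision = TP / (TP + FP), recall = TP / (TP + FN), true negative rate = TN / (TN + FP) and accuracy = (TP + TN) / (TP + TN + FP + FN). The function names and the example assessments are hypothetical.

```python
from collections import Counter

def classify(expectation, outcome):
    """Map an (expectation, algorithm outcome) pair to a table cell."""
    if expectation == "no impact":
        return "TN" if outcome == "no impact" else "FP"
    # Expected improvement or degradation: only an exact match is a hit.
    return "TP" if outcome == expectation else "FN"

def ratio(numerator, denominator):
    """Return the ratio, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def metrics(pairs):
    counts = Counter(classify(e, o) for e, o in pairs)
    tp, tn, fp, fn = (counts[k] for k in ("TP", "TN", "FP", "FN"))
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "true_negative": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }

# Hypothetical assessments: (expectation, algorithm outcome).
pairs = [
    ("improvement", "improvement"),
    ("degradation", "degradation"),
    ("improvement", "no impact"),
    ("degradation", "improvement"),
    ("no impact", "no impact"),
    ("no impact", "degradation"),
    ("no impact", "no impact"),
    ("improvement", "improvement"),
]
result = metrics(pairs)
print(result)
```

Note that a wrong direction (degradation expected, improvement reported) counts as a false negative, exactly as in the table, rather than as a false positive.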
Evaluation results
[Figure: two bar charts, "Evaluation using known assessments" and "Evaluation using synthetic injection"; bars for Study Group Only, Difference in Differences, and Litmus Robust Spatial Regression; metrics on the horizontal axis: Precision, Recall, True Negative, Accuracy]
- Litmus outperforms DiD due to zero false negatives
- Study group only analysis has poor accuracy due to high FP and FN
- Metric definitions
  - Precision = TP / (TP + FP)
  - Recall = TP / (TP + FN)
  - True Negative = TN / (TN + FP)
  - Accuracy = (TP + TN) / (TP + TN + FP + FN)
- Compared to study group only analysis and DiD, Litmus is robust to external factors and accurately conducts the impact assessment

Litmus operational experiences
- Litmus is being heavily used for FFA impact assessment in production cellular networks
  - Pre/post impact analysis across a wide variety of performance metrics
  - Outcome is used for a go or no-go decision for wide-scale deployment of FFA change

Change type | Location | Impact Expectation | Impact Assessment by Litmus | External factor
Reduce start-up times for data sessions | Radio Network Controller (RNC) | No degradation in voice | Degradation in voice | None
Configuration changes | Mobile Switching Center (MSC), voice switch | Improvement in voice | No improvement | Foliage
SON load balancing and neighbor discovery | Cell Towers | Improvement in call connection | Improvement | Hurricane Sandy
Improve cell change success rates | Radio Network Controller (RNC) | Improvement in call retention | No improvement | Traffic pattern changes due to holiday

Impact of SON during Hurricane Sandy
- SON (Self Optimizing Network) features were being trialed on some cell towers
  - SON capabilities: automated load balancing, neighbor discovery and self-configuration
- Key question: How did SON perform during Hurricane Sandy?
  - This question cannot be answered without comparison to control group
  - Control group has to be within the Sandy-impacted region
- Both study and control group were impacted due to Sandy; however study group did better than control
- The recovery on study group was also faster than on control group
- SON did a good job! SON features were rolled out network-wide

FFA to improve cell change success rate
- FFA change applied at a few RNCs
  - Expectation: Improvement in data retainability
- Study-group *only* analysis would have led to improvement inference and recommendation made for nation-wide roll-out
- After comparing to control, Litmus identified that improvement was really due to holidays
  - Traffic pattern changes induced improvements in data retainability across both study & control
  - FFA change thus was not inducing performance improvements
- Decision was made not to roll-out based on Litmus results
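The holiday case above can be illustrated with a simplified study/control sketch. It is not the Litmus algorithm as published: it assumes a one-variable least-squares fit of the study series against the mean of a randomly sampled subset of controls, and it swaps the robust rank-order tests for a plain median-shift threshold. All names, parameters and data are hypothetical, and the metric is assumed to be higher-is-better.

```python
import random
from statistics import mean, median

def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def verdict(shift, tolerance):
    if shift > tolerance:
        return "improvement"
    if shift < -tolerance:
        return "degradation"
    return "no change"

def study_only(before, after, tolerance=0.5):
    """Pre/post comparison that ignores the control group."""
    return verdict(median(after) - median(before), tolerance)

def assess(study_before, study_after, controls_before, controls_after,
           iterations=25, sample_size=3, tolerance=0.5, seed=0):
    """Repeat: sample controls, fit on 'before', compare forecast gaps."""
    rng = random.Random(seed)
    names = sorted(controls_before)
    shifts = []
    for _ in range(iterations):
        sample = rng.sample(names, sample_size)
        x_before = [mean(v) for v in zip(*(controls_before[n] for n in sample))]
        x_after = [mean(v) for v in zip(*(controls_after[n] for n in sample))]
        slope, intercept = fit_line(x_before, study_before)
        gap_before = [s - (slope * x + intercept)
                      for s, x in zip(study_before, x_before)]
        gap_after = [s - (slope * x + intercept)
                     for s, x in zip(study_after, x_after)]
        shifts.append(median(gap_after) - median(gap_before))
    # Median over iterations limits the influence of a few bad samples.
    return verdict(median(shifts), tolerance)

# Hypothetical data: a shared weekly pattern plus a holiday lift that
# affects study and control elements alike.
pattern = [0.0, 0.2, 0.4, 0.2, 0.0, -0.2, -0.4] * 2
holiday_lift = 1.0
bases = {"C1": 96.5, "C2": 96.8, "C3": 97.1, "C4": 97.3, "C5": 97.5}
controls_before = {n: [b + p for p in pattern] for n, b in bases.items()}
controls_after = {n: [b + p + holiday_lift for p in pattern]
                  for n, b in bases.items()}
study_before = [97.0 + p for p in pattern]
study_holiday_only = [97.0 + p + holiday_lift for p in pattern]
study_real_gain = [97.0 + p + holiday_lift + 0.8 for p in pattern]

print(study_only(study_before, study_holiday_only))
print(assess(study_before, study_holiday_only, controls_before, controls_after))
print(assess(study_before, study_real_gain, controls_before, controls_after))
```

With this data the study-only comparison reports an improvement, while the study/control comparison reports no change for the holiday-only series and an improvement only when the study gains more than its controls.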
Conclusions and Future Work
- Litmus: an automated tool for robust assessment of changes in cellular networks
  - Carefully accounts for external factors such as foliage, weather, holidays, or network events
  - New spatial regression algorithm for robust performance comparison of study versus control
  - Domain knowledge guided control group selection
  - Outperforms study-group only analysis and Difference in Differences (DiD)
- Operational Experiences
  - Litmus is being used successfully in go/no-go decisions for wide-scale deployment of changes
  - Considerably improved the assessment accuracy and analysis time
- Future Work
  - Continue to improve methodology for control group selection
  - Apply to other networks and services such as clouds, data centers
  - Extend Litmus to device specific monitoring, e.g., Apple iPhone, Samsung Galaxy or Nokia Lumia

Thank You! Questions?