Litmus: Robust Assessment of Changes in Cellular Networks
Ajay Mahimkar, Zihui Ge, Jennifer Yates, Chris Hristov*,
Vincent Cordaro*, Shane Smith*, Jing Xu*, Mark Stockert*
AT&T Labs – Research
* AT&T Mobility Services
ACM CoNEXT 2013, Santa Barbara, CA
© 2013 AT&T Intellectual Property. All rights reserved. AT&T, the AT&T logo and all other AT&T marks contained herein
are trademarks of AT&T Intellectual Property and/or AT&T affiliated companies. All other marks contained herein are
the property of their respective owners.
Cellular network changes

Network changes and assessment
 Software upgrades, configuration changes, …
 How does it impact user perception of service quality?
  Voice and data connection attempts (Accessibility)
  Successful termination of ongoing calls (Retainability)
  Data throughput, Voice Erlangs, …

[Figure: Cell Tower, Radio Network Controller, circuit-switched core (Voice) and packet-switched core (Data)]

Extensive testing in labs before deployment in the field
 However, no lab can fully replicate scale, complexity and diversity of large-scale operational networks

First Field Application (FFA)
 Before rolling out the change network-wide, conduct small scale testing in operational network
Impact Assessment of FFA

[Figure: FFA, then analyze the performance impact of FFA (Improvement, No change or Degradation), then go/no-go decision for a wide-scale roll-out]

FFA pre/post impact analysis of service performance
 Compare service performance after the FFA with that before it
 If the FFA is successfully trialed and shows the expected performance impacts, then it can be rolled out network-wide
 Go/no-go decision is crucial
 Challenges: external factors can make assessment difficult
Dependency on external factors
Service performance in cellular networks is influenced by several external factors
 Weather (heavy rainfall introduces obstruction for radio signals)
 Terrain (Mountains/flat surfaces/tall buildings have different propagation properties)
 User population densities and mobility patterns
 Seasonal changes (foliage or leaves budding)
 Traffic pattern changes (holidays, major events or trade shows)
 Other network events (outages or maintenance activities in other parts of network)
A configuration change accidentally co-occurs with strong winds that negatively impact service performance
 Unnecessary roll-back of the change without knowledge of the impact of the strong winds
Dependency on external factors
Yearly seasonality in Voice Retainability for UMTS cell towers due to foliage
 Assessment of changes would be difficult because of seasonal changes
Degradations in Voice Accessibility across multiple RNCs due to severe storms and damaging hail during a tornado
 Assessment of changes at RNCs would be difficult because of weather impact
Dependency on external factors
Dramatic traffic pattern change during holidays induces significant changes in Voice Retainability
 Assessment of changes would be difficult because of traffic pattern changes
Improvements in Voice Retainability across a majority of cell towers due to software upgrade at an upstream RNC
 Assessment of changes at cell towers would be difficult because of upstream RNC changes
Pre/post impact analysis of FFA changes needs to account for the overshadowing effects of external factors
Litmus idea – study/control comparison

Compare performance between study and control group
 Study group – network elements where change is implemented
 Control group – network elements without the change

[Figure: User Equipments, Cell Tower, Radio Network Controller (RNC), circuit-switched core (Voice) and packet-switched core (Data), with elements labeled Study (FFA) and Control]

Intuition
 Performance at geographically nearby elements is correlated
 External factor influences performance at both study and control
 A performance impacting change at study will change the dependency between study and control

Challenges
 Unrelated performance changes in a small number of control group members
 Poor selection of control group

Litmus Solution
 Robust spatial regression algorithm
 Domain knowledge guided control group selection
Litmus comparison to related work

Study-group only analysis
 Mercury [SIGCOMM’10], PRISM [CoNEXT’11], Spectroscope [NSDI’11], …
 Does not account for impact of unrelated external factors

A/B testing – also known as split testing, control/treatment
 Popular in web domains for data driven decision making [KDD’07,’12]
 Web users randomly exposed to the two variants of experiment
 Why doesn’t it apply in our context?
  Tight coupling between experiment and assessment
  Control group might be subject to other network events such as changes or unplanned outages
 [Figure: Control (A) and Study (B) variants with Feedback]

Difference in Differences (DiD)
 Compare mean/median difference between study and control before and after the change
 Why doesn’t it apply in our context?
  Contamination of forecast due to poor selection of control group
  Sensitivity to performance changes in a small number of control group members
 [Figure: cell towers serving users at a lake and serving users at a business location]
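The Difference in Differences comparison described above, and its sensitivity to a single bad control member, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `did_estimate`, the list-of-series layout and every number are hypothetical and are not taken from the evaluation.

```python
from statistics import mean


def did_estimate(study_before, study_after, controls_before, controls_after):
    """Mean-based DiD: change at the study element minus the average change
    across control elements. Each controls_* argument is a list holding one
    time-series per control element (hypothetical layout)."""
    study_change = mean(study_after) - mean(study_before)
    control_change = (mean([mean(c) for c in controls_after])
                      - mean([mean(c) for c in controls_before]))
    return study_change - control_change


# Hypothetical voice retainability (%) samples; the study element does not change.
study_before = [98.0, 98.0, 98.0, 98.0]
study_after = [98.0, 98.0, 98.0, 98.0]

controls_before = [[98.0] * 4, [98.0] * 4, [98.0] * 4]
clean_controls_after = [[98.0] * 4, [98.0] * 4, [98.0] * 4]
# One control element suffers an unrelated outage after the change.
bad_controls_after = [[98.0] * 4, [98.0] * 4, [92.0] * 4]

clean = did_estimate(study_before, study_after, controls_before, clean_controls_after)
skewed = did_estimate(study_before, study_after, controls_before, bad_controls_after)
print(clean, skewed)
```

With clean controls the estimate is zero. With one degraded control element the estimate turns positive, which would wrongly suggest an improvement at the study element.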
Robust spatial regression

[Figure: Before change, a regression f (regression coefficients) maps the Sampled Control to a Forecast of the Study; after change, f is applied to the Sampled Control again. In each period, Delta is the difference between the Forecast and the observed Study]

Robust rank-order tests compare the before-change and after-change Delta
Multiple iterations of forecast difference comparison increase the robustness to a few bad members in the control group
Output: Degradation/Improvement/No change
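The flow above can be sketched roughly as follows. This is a simplified illustration, not the Litmus implementation: it assumes NumPy, ordinary least squares for the regression, and it swaps the robust rank-order test for a plain comparison of median forecast differences against a hypothetical `threshold`. The names `forecast_deltas` and `assess`, the parameters `k`, `iterations` and `seed`, and the synthetic data are all assumptions, and a higher metric value is assumed to be better.

```python
import numpy as np


def forecast_deltas(study, control, train_rows):
    """Fit study ~ control (plus intercept) on train_rows by least squares and
    return observed minus forecast for every row."""
    design = np.column_stack([control, np.ones(len(study))])
    coef, *_ = np.linalg.lstsq(design[train_rows], study[train_rows], rcond=None)
    return study - design @ coef


def assess(study, controls, change_idx, k=3, iterations=100, threshold=0.5, seed=0):
    """Hypothetical assessment loop; assumes a higher metric value is better."""
    rng = np.random.default_rng(seed)
    before = np.arange(change_idx)
    shifts = []
    for _ in range(iterations):
        # Sample k control elements (columns) for this iteration.
        cols = rng.choice(controls.shape[1], size=k, replace=False)
        delta = forecast_deltas(study, controls[:, cols], before)
        # Shift of the forecast difference from before to after the change.
        shifts.append(np.median(delta[change_idx:]) - np.median(delta[:change_idx]))
    shift = float(np.median(shifts))  # median over iterations damps bad samples
    if shift > threshold:
        return "improvement"
    if shift < -threshold:
        return "degradation"
    return "no change"


# Synthetic demo (all values hypothetical).
rng = np.random.default_rng(1)
t = np.arange(60)
external = np.sin(t / 4.0)            # shared external factor, e.g. weather
controls = 98 + external[:, None] + rng.normal(0, 0.1, (60, 12))
controls[30:, 0] += 3.0               # unrelated change at one control element
study = 98 + external + rng.normal(0, 0.1, 60)
degraded = study.copy()
degraded[30:] -= 2.0                  # change at row 30 degrades the study element

print(assess(degraded, controls, change_idx=30))
print(assess(study, controls, change_idx=30))
```

In this synthetic setup the first call should report a degradation and the second no change, even though one control element shifts for unrelated reasons. Taking the median over many sampled control subsets is one simple way to limit the influence of such a member.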
Domain knowledge guided control group selection

Guidelines for control group selection
 Subject to same external factors as the study group
 Share similar properties with study group such as geographical proximity or configuration

Control group size
 Not too large: difficult to capture similar impact due to external factor
 Not too small: lose benefits of robustness in spatial regression analysis

Attributes for selection
 Geographical distance using latitude/longitude and zip-code
 Topological structure of the cellular network
 Configuration settings such as software version, or equipment model

Predicates to select control group
 Uni-variate – single attribute (e.g., LTE cell towers within the same zip-code)
 Multi-variate – combination of attributes (e.g., UMTS cell towers with same RNC and same OS)
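The uni-variate and multi-variate predicates above could be expressed along these lines. The `Element` record, its field names, the helper functions `same` and `select_control`, and the inventory values are all hypothetical; a real selection would also exclude every element that received the change, not only the single study element used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical inventory record for a cell tower."""
    name: str
    technology: str
    zip_code: str
    rnc: str
    software: str


def same(*attributes):
    """Predicate: candidate equals the study element on every listed attribute."""
    def predicate(study, candidate):
        return all(getattr(study, a) == getattr(candidate, a) for a in attributes)
    return predicate


def select_control(study, inventory, predicate):
    """Return names of elements (other than the study element) matching the predicate."""
    return [e.name for e in inventory if e != study and predicate(study, e)]


inventory = [
    Element("tower-a", "UMTS", "93101", "rnc-1", "v2"),
    Element("tower-b", "UMTS", "93101", "rnc-1", "v2"),
    Element("tower-c", "UMTS", "93101", "rnc-1", "v1"),
    Element("tower-d", "UMTS", "93105", "rnc-2", "v2"),
]
study = inventory[0]

# Uni-variate: same zip-code. Multi-variate: same RNC and same software version.
uni_variate = select_control(study, inventory, same("zip_code"))
multi_variate = select_control(study, inventory, same("rnc", "software"))
print(uni_variate, multi_variate)
```

Building predicates from attribute names keeps the uni-variate and multi-variate cases in one code path: a single attribute gives the first, several attributes give the second.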

Litmus evaluation

Evaluation conducted using data collected from operational cellular networks
 Lack of complete ground truth makes evaluation extremely challenging
 Two-step methodology
  A-priori known changes and assessment by Engineering & Ops (real examples)
   Previously conducted manually through visual inspection & analysis
  Synthetic injection of changes in performance time-series at cell towers
   Thorough and exhaustive evaluation
 Compare Litmus with Difference in Differences (DiD) and study-group only analysis
Accuracy computation

                              ALGORITHM OUTCOME
EXPECTATION     Improvement      Degradation      No impact
Improvement     True positive    False negative   False negative
Degradation     False negative   True positive    False negative
No impact       False positive   False positive   True negative
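The accuracy table above can be read as a small lookup rule. The sketch below encodes it directly; the function name `classify` and the lowercase label strings are illustrative choices, not part of the original material.

```python
def classify(expectation, outcome):
    """Map an (expectation, algorithm outcome) pair to TP, FN, FP or TN,
    following the accuracy table."""
    if expectation == "no impact":
        # Any reported impact where none was expected is a false positive.
        return "TN" if outcome == "no impact" else "FP"
    # An expected impact counts as detected only if the direction also matches.
    return "TP" if outcome == expectation else "FN"


labels = ["improvement", "degradation", "no impact"]
for expectation in labels:
    print(expectation, [classify(expectation, outcome) for outcome in labels])
```

Note that reporting the wrong direction (for example, a degradation where an improvement was expected) is counted as a false negative, as in the table.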
Result summary
 Litmus outperformed study-group only analysis because of robustness to external factors
 Litmus outperformed DiD because of robustness to a small number of bad members in control group
Evaluation results

[Figure: two bar charts, "Evaluation using known assessments" and "Evaluation using synthetic injection", comparing Study Group Only, Difference in Differences and Litmus Robust Spatial Regression on Precision, Recall, True Negative and Accuracy]
 Litmus outperforms DiD due to zero false negatives
 Study group only analysis has poor accuracy due to high FP and FN

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
True Negative = TN / (TN + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Compared to study group only analysis and DiD, Litmus is robust to external factors and accurately conducts the impact assessment
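As a worked instance of the four formulas above, the following sketch computes them from raw counts. The function name `metrics` and the example counts are hypothetical and are not results from the evaluation; the sketch assumes every denominator is non-zero.

```python
def metrics(tp, fp, tn, fn):
    """Compute the four evaluation measures from confusion counts.
    Assumes every denominator is non-zero."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "true_negative": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts, for illustration only.
example = metrics(tp=8, fp=2, tn=9, fn=1)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

For these made-up counts, precision is 8/10, recall is 8/9, the true negative rate is 9/11 and accuracy is 17/20.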
Litmus operational experiences

Litmus is being heavily used for FFA impact assessment in production cellular networks
 Pre/post impact analysis across a wide variety of performance metrics
 Outcome is used for a go or no-go decision for wide-scale deployment of FFA change

Change type                                Location                                      Impact Expectation              Impact Assessment by Litmus  External factor
Reduce start-up times for data sessions    Radio Network Controller (RNC)                No degradation in voice         Degradation in voice         None
Configuration changes                      Mobile Switching Center (MSC) – Voice switch  Improvement in voice            No improvement               Foliage
SON load balancing and neighbor discovery  Cell Towers                                   Improvement in call connection  Improvement                  Hurricane Sandy
Improve cell change success rates          Radio Network Controller (RNC)                Improvement in call retention   No improvement               Traffic pattern changes due to holiday
Impact of SON during hurricane Sandy

SON (Self Optimizing Network) features were being trialed on some cell towers
 SON capabilities: automated load balancing, neighbor discovery and self-configuration
 Key question: How did SON perform during hurricane Sandy?
  This question cannot be answered without comparison to control group
  Control group has to be within the Sandy-impacted region

Both study and control group were impacted due to Sandy; however study group did better than control
 The recovery on study group was also faster than on control group
 SON did a good job! SON features were rolled out network-wide
FFA to improve cell change success rate

FFA change applied at a few RNCs
 Expectation: Improvement in data retainability
 Study-group *only* analysis would have inferred an improvement and led to a recommendation for nation-wide roll-out
 After comparing to control, Litmus identified that improvement was really due to holidays
  Traffic pattern changes induced improvements in data retainability across both study & control
  FFA change thus was not inducing performance improvements
  Decision was made not to roll out based on Litmus results
Conclusions and Future Work
Litmus – an automated tool for robust assessment of changes in cellular networks
Carefully accounts for external factors such as foliage, weather, holidays, or network events
 New spatial regression algorithm for robust performance comparison of study versus control
 Domain knowledge guided control group selection
 Outperforms study-group only analysis and Difference in Differences (DiD)

Operational Experiences
Litmus is being used successfully in go/no-go decisions for wide-scale deployment of changes
 Considerably improved the assessment accuracy and analysis time

Future Work
Continue to improve methodology for control group selection
 Apply to other networks and services such as clouds, data centers
 Extend Litmus to device specific monitoring – e.g., Apple iPhone, Samsung Galaxy or Nokia Lumia

Thank You!
Questions?