Group 18

Cellular Network Performance
Measurement
Class Presentation for CS 234 - Advanced Networks
by Pramit Choudary, Balaji Raao & Ravindra
Bhanot (Group 18)
Instructor: Professor Nalini Venkatasubramanian
05/10/2012
Papers considered
• Paper 1: Understanding Traffic Dynamics in
Cellular Data Networks by U. Paul, A.
Subramanian, M. Buddhikot, S. Das, IEEE
INFOCOM 2011, Shanghai, China
• Paper 2: An Untold Story of Middleboxes in
Cellular Networks, SIGCOMM 2011, Toronto,
Ontario, Canada
(NOTE: Please refer to the relevant papers listed above in place of ‘paper
1’ or ‘paper 2’ found in the presentation slides.)
2
Background - Internet/Data Access?
• Dial-up connection
• Broadband (DSL, Cable Internet, Fiber Optics)
• Wi-Fi (IEEE 802.11 standard) & WiMAX (IEEE
802.16 standard)
• Mobile Broadband using 2.5G, 3G, 4G
technologies
Each claims to cater to different data rates, operating ranges,
end-user and application needs, energy savings, etc., using
different protocol designs, business strategies, network
deployments and more.
3
Background - Cellular Networks and
interconnecting subsystems
4G: Fourth generation of cell phone mobile communications standard
3G: Third generation of cell phone mobile communications standard
Femtocell: Small cellular base station designed for use in a home or small business
IMS: IP Multimedia Subsystem, used to provide mobile and fixed multimedia services
Image courtesy: radisys.com
4
Background - Broadband Cellular Networks
• E.g. HSPA - Mobile telephony protocols used in
3G cellular networks for mobile data access.
• Broadband cellular access becoming most
common and pervasive world-wide.
• Fueled by introduction of user-friendly smart
phones, notebooks, tablets, eBook readers.
5
Background - A look at smartphone technology
Courtesy: Technology Review, Published by MIT, May 9th 2012
6
Background - Broadband Cellular Networks
• Has led to innovative & flashy mobile applications
like gaming, video streaming, social networking,
etc.
• Carriers deploy various types of middleboxes,
e.g. Network Address Translation (NAT) boxes and
firewalls, to manage and protect the scarce
network resources (scarce because they are
mostly shared).
7
Background - On usage of middleboxes
• Cellular network middleboxes (deployed by carriers like AT&T,
T-Mobile) and mobile applications (written by application developers)
are often managed independently.
• Knowledge mismatch -> End-to-end performance degradation, increased
energy consumption, new security vulnerabilities.
• E.g. Carrier setting aggressive timeout for inactive TCP connections in
the firewall and disrupting long lived and occasionally idle connections
maintained by applications like instant messaging, push-based email, etc.
• Need for understanding the effects of middleboxes in cellular network.
• Paper 2 specifically focuses on NAT boxes, their policies & firewall and
its policies.
8
Background - Broadband Cellular Networks (Contd.)
• The volume of data is expected to increase
exponentially.
• Supporting such an increase requires good
understanding of traffic dynamics and its impact
on resource allocation on the service provider’s
network.
• Leading to better resource planning, network
designs, spectrum allocation and energy savings.
9
Background - Broadband Cellular Networks (Contd.)
• For some exciting numbers, refer to a white
paper by Cisco on global mobile data traffic
forecast for 2011-2016:
http://www.cisco.com/en/US/solutions/collat
eral/ns341/ns525/ns537/ns705/ns827/white_
paper_c11-520862.html
10
Paper 1 - Short Summary
• Discuss traffic dynamics specific to 3G cellular
networks.
• End user perspective: Study subscriber traffic
patterns, number of distinct base stations visited
by subscribers, relate mobility and traffic,
subscriber temporal activity & relate subscriber
activity and traffic.
• Network component perspective: Study
aggregated load at base stations, base station
load distributions, spatial characteristics,
temporal characteristics and spatio-temporal
characteristics of network load at base stations.
11
Paper 1 - Short Summary (Contd.)
• Provide implications on the measurements and
observations made.
• Test conducted in 2007 for a week over a US
nation-wide network with thousands of base
stations and with entire subscriber base (order of
hundreds of thousands i.e. close to a million).
• Performed measurements on all generated data
packet headers (not including payloads) and on
signaling & accounting packets.
12
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Traffic Distribution:
KEY OBSERVATIONS
• Heavy users: Users who generate
as high as 10GB of traffic per day
(10^5 times median).
• Light users: Users who generate
less than 1KB per day.
• CDF shifts left over weekends.
Fig. CDF of total traffic
volume per subscriber per
day.
INFERENCE
• Less traffic on weekends relative
to traffic on working days.
13
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Traffic Distribution (Contd.):
KEY OBSERVATIONS
• 1% of the subscribers create more
than 60% of the daily network
traffic.
• 10% of subscribers create 90% of
the daily network traffic.
Fig. CDF of normalized
traffic volume over the
percentage of subscribers
per day.
INFERENCE
• Imbalance in network usage with
few subscribers (10%) using much
of the network resources.
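The concentration figures above can be computed directly from per-subscriber daily volumes. A minimal sketch, assuming a plain list of byte counts; the function name `top_share` and the sample numbers are hypothetical and do not come from the paper:

```python
def top_share(volumes, fraction):
    """Share of total traffic generated by the heaviest `fraction` of subscribers."""
    if not volumes or not 0 < fraction <= 1:
        raise ValueError("need data and a fraction in (0, 1]")
    ranked = sorted(volumes, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # number of subscribers in the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical daily volumes in bytes for 10 subscribers (made-up numbers).
daily_bytes = [9000, 500, 200, 100, 80, 50, 40, 20, 5, 5]
share = top_share(daily_bytes, 0.10)
print(f"top 10% of subscribers generate {share:.0%} of the traffic")
```

Sweeping `fraction` from 0 to 1 would trace a curve of the same kind as the normalized CDF in the figure.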
14
Paper 1 - Subscriber Traffic Dynamics
• Implications of Subscriber Traffic Distribution:
1) An unlimited data plan with flat rate pricing is
not efficient both from the carrier’s
perspective and subscriber’s perspective.
2) CDF graphs shown in previous two slides can
be used to create a ‘tiered’ rate plan for data.
3) Tiered rate plan deals with providing different
pricing options based on data usage.
15
Paper 1 - Subscriber Traffic Dynamics
• Implications of Subscriber Traffic Distribution
(Contd.):
4) To alleviate the problem of high volume
subscribers creating poor experience for other
subscribers, high volume subscribers can be
provided with some incentives.
5) Paper doesn’t consider optimal pricing schemes
based on subscriber usage and network capacity.
It only provides heuristic implications for
subscriber traffic distribution.
16
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (i.e. Base Stations Visited):
Fig. CDF of number of
distinct base stations
visited by a subscriber each
day.
KEY OBSERVATIONS
• Distribution similar on weekdays
and different on weekends.
• 60% of users are stationary (i.e.
constrained within a cell) and over
95% of users travel across less than
10 base stations in a day.
• Highly mobile users (who visit
more than 50 distinct base stations
in a day) are about 0.01%.
17
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (i.e. Base Stations Visited):
INFERENCE
• Tendency of lesser degree of
mobility on weekends.
• In terms of the number of distinct
base stations visited, the overall
mobility is low.
Fig. CDF of number of
distinct base stations
visited by a subscriber each
day.
18
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (Radius of Gyration):
Fig. CDF of radius of
gyration.
• Radius of Gyration is the linear
size occupied by a subscriber’s
trajectory. Requires certain
duration of time (t) for
computation from subscriber’s
trajectory.
• It is basically a root mean square
value.
• Calculated with respect to the
center of mass point of the user’s
trajectory.
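The definition above (a root mean square distance from the trajectory's center of mass) can be sketched as follows. This is an illustration only, assuming planar coordinates in miles and equally weighted location samples; the function name and sample trajectory are hypothetical:

```python
import math


def radius_of_gyration(points):
    """RMS distance of trajectory points from their center of mass."""
    if not points:
        raise ValueError("empty trajectory")
    n = len(points)
    # Center of mass of the recorded locations.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean squared distance from the center of mass.
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Hypothetical trajectory: the four corners of a 2-mile square.
trajectory = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
rg = radius_of_gyration(trajectory)
print(f"radius of gyration: {rg:.3f} miles")
```

Recomputing the value over growing time windows gives the kind of curve used to study saturation.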
19
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (Radius of Gyration):
KEY OBSERVATIONS:
• 53% of subscribers are practically
static and almost 98% of the
subscribers have radius of gyration
less than 100 miles.
Fig. CDF of radius of
gyration.
INFERENCE:
• Shows the low level of mobility of
majority of subscribers (half of
them).
20
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (Radius of Gyration):
Fig. Radius of gyration versus duration
of computation for subscribers
categorized into 4 groups according to
their final rg at the end of seven-day
period.
KEY OBSERVATIONS:
• Radius of gyration on an average
comes to a saturation point in just
few days (based on no. of hours).
Saturation indicates that some sort
of boundary in the movement area
has been reached. Quick saturation
measured in terms of ‘return
probability’ in next slide.
• Users with larger radius of
gyration need longer time to
saturate.
21
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (Radius of Gyration):
KEY OBSERVATIONS:
• Distribution has peaks at 24th,
48th and 72nd hours.
Fig. Probability distribution of time to
returning to the same location.
INFERENCE:
• Periodic nature of human
mobility with a 24 hour period (like
coming back home) and tendency
to return to the same location
periodically. This explains the
saturation of radius of gyration.
22
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Mobility (Radius of Gyration):
Fig. Probability of finding a subscriber
at different locations that are ranked
on the basis of their frequency of
visits. Shows four categories of
subscribers who visit 5, 10, 30 and 50
distinct base stations.
KEY OBSERVATIONS:
• Location with rank, L = 1 indicates
the most visited base station for a
subscriber.
• Subscribers spend 30% of their
time in the top two preferred
locations.
INFERENCE:
• Subscribers are found at their
favorite location with high
probability even when they are
highly mobile.
23
Paper 1 - Subscriber Traffic Dynamics
• Inferences on Subscriber Mobility so far
1) A large fraction of subscribers have limited
mobility (roughly half of them are practically
static, moving within just 1 mile).
2) Subscriber mobility also exhibits periodic
behavior with high probability of returning to
same base station at same time of the day.
3) Overall mobility is predictable.
4) More mobile users tend to generate more
traffic.
24
Paper 1 - Subscriber Traffic Dynamics
• Implications on Subscriber Mobility
1) Idea of caching content and delivering it to
subscribers who exhibit a predictable
mobility behavior - Innovative cloud-based
content delivery applications.
2) Optimizing the location based services and
targeted ad-services through predictable
mobility pattern.
25
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber mobility and traffic they generate:
Fig. CDF of traffic generated per day
by subscribers based on number of
locations (base stations) visited in a
day.
Fig. CDF of traffic generated per day
by subscribers based on radius of
gyration.
26
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber mobility and traffic they generate:
KEY OBSERVATIONS FROM PREVIOUS SLIDE:
• Though the plot lines appear similar, they differ in
traffic volume for different number of base stations
visited and traffic volume for different radii of gyration.
INFERENCE:
• More traffic is generated by more mobile subscribers.
• Median traffic generated by subscribers in the highest
mobility category is roughly twice that of the
subscribers in the lowest mobility category.
27
Paper 1 - Subscriber Traffic Dynamics
• Implications relating to subscriber mobility
and traffic they generate:
1) Planning resources dynamically based on
traffic generated by subscribers specific to
subscriber timings of movements.
2) Spectrum management based on timings of
traffic generated and in different cells.
28
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Temporal Activity:
It is the number of days (or hours)
in a week (or in a day), subscribers
generate traffic.
Fig. CDF of number of
hours among peak hours (8
AM to 8 PM) subscribers
generate traffic.
KEY OBSERVATIONS
• About 28% of the subscribers
generate traffic only in single hour
during the peak hours.
• A typical subscriber (i.e. median)
is active in 4 different hours
during the peak hours. (Consider a
horizontal line at the 50% mark
across the graph.)
29
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Temporal Activity:
INFERENCE:
• Large fraction of subscribers
generate traffic only in few hours
within a day.
• That is, most subscribers
generate traffic for only a short
duration (of the week / of a
day).
Fig. CDF of number of
hours among peak hours (8
AM to 8 PM) subscribers
generate traffic.
30
Paper 1 - Subscriber Traffic Dynamics
• Subscriber Temporal Activity:
Airtime: Amount of time a
subscriber holds onto a radio
channel regardless of whether it
communicates or not.
Fig. CDF of airtime among
subscribers.
KEY OBSERVATIONS:
• Median usage is about 100 sec.
• Very few subscribers (less than
1%) hold the radio channel for all
24 hrs (86,400 sec).
• Weekend usage again lower
compared to weekday usage.
31
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber temporal activity and traffic they generate:
KEY OBSERVATIONS:
• A typical heavy user appears in 4
to 6 different hours during peak
hours in the days they generate
traffic.
Fig. CDF of occurrence for
heavy users (within top
5000 in at least one day in
the week with regard to
traffic) in peak hours.
INFERENCE:
• Most heavy users are actually
quite sporadic in traffic
generation and not habitual.
32
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber temporal activity and traffic they generate:
Effective bit rate is the ratio of
amount of actual traffic generated
by the subscribers to the airtime.
Metric for efficient radio channel
use.
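As a small worked example of this ratio, the sketch below converts a traffic volume and an airtime into bits per second. The function name and the input values are hypothetical; pairing 30 KB of traffic with 100 seconds of airtime is only an illustration, not a result from the paper:

```python
def effective_bit_rate(traffic_bytes, airtime_seconds):
    """Effective bit rate in bits per second: traffic generated divided by airtime."""
    if airtime_seconds <= 0:
        raise ValueError("airtime must be positive")
    return traffic_bytes * 8 / airtime_seconds


# Hypothetical subscriber: 30,000 bytes generated while holding the channel for 100 s.
rate = effective_bit_rate(30_000, 100)
print(f"effective bit rate: {rate:.0f} bit/s")
```

A subscriber who holds the channel for long idle periods lowers this value even when the radio link itself is fast.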
Fig. CDF of effective bit
rate for subscribers
categorized by traffic
generated per day.
KEY OBSERVATIONS:
• Subscribers generating less traffic
(<= 30 KB) have a poorer effective bit
rate than those generating more
traffic. This may be due to the kind of
application they use (next slide).
33
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber temporal activity and traffic they generate:
Fig. Effective bit rate for
popular TCP based
applications.
KEY OBSERVATIONS:
• P2P and http:yahoo have the best
channel efficiencies.
• VPN, https and http for Google,
Microsoft have poorest
efficiencies.
INFERENCE:
• Enterprise applications generate
less traffic compared to other
applications for the same airtime.
• All applications have significantly
poorer effective bit rates compared
to nominal rates (phy channel).
34
Paper 1 - Subscriber Traffic Dynamics
• Relating subscriber temporal activity and traffic they generate:
Fig. Effective bit rate for
popular TCP based
applications.
REASONING for INFERENCE:
• Enterprise applications (VPN)
tend to use network sporadically
like keep-alive messages and
typically not high throughput
applications.
• Considering dormancy/sleep
modes, effective bit rate is poor for
VPN-like applications.
• High throughput applications like
P2P use the channel better.
35
Paper 1 - Subscriber Traffic Dynamics
• Implications on effective bit rates:
1) Inefficiency in the usage of the radio
channel airtime drives the need for an
innovative protocol to use wireless channel
efficiently.
2) Inefficiency arises because of wired-Internet protocols used to access wireless
channel and hence better network
protocols need to be designed.
36
BASE STATION TRAFFIC DYNAMICS
We focus on network behavior as a whole or in terms of network components (base
stations) instead of focusing on subscribers.
• Aggregate Load
• Base Station Load Distribution
• Spatial Characteristics
• Temporal Characteristics
– Load
– Auto-correlation
• Spatiotemporal Characteristics
37
BASE STATION TRAFFIC DYNAMICS - Contd.
Aggregate Load:
• Total traffic split into upload
and download for each day of
the week.
• Weekends see a lesser
load.
• Downloads dominate relative
to uploads with more than
75% of daily load coming from
download traffic
38
BASE STATION TRAFFIC DYNAMICS- Contd.
Aggregate Load:
Load on the network is
relatively low in the early
morning hours, and roughly
similar during the day and
the evening.
39
BASE STATION TRAFFIC DYNAMICS- Contd.
Base Station Load Distribution: Volume of daily traffic load for each
base station
• 80% of the base stations carry a load in the
range of 1-100 MB per day, and 10% of the
base stations are highly loaded (more than
100 MB per day).
• A second plot shows the CDF of daily base station
loads normalized by the total network load.
• 10% of the base stations carry roughly
50-60% of the aggregate traffic load.
In both cases, weekend behavior is slightly different from weekday behavior. The load
imbalance seems more pronounced on weekends. The great imbalance in base station loads
indicates that more careful cell planning is possibly needed. Network providers
may use smaller cells or microcells at the hotspots to even out the imbalance.
40
BASE STATION TRAFFIC DYNAMICS- Contd.
Spatial Characteristics
• Goal is to identify whether or how much spatially
correlated the network load is.
• Estimates can potentially help the provider to allocate
resources appropriately.
• Use of Voronoi cells to conduct the experiments
• Voronoi cell corresponds to the geographic region of each
base station’s coverage.
E.g. 10 shops in a flat city and their Voronoi
cells
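A point belongs to the Voronoi cell of whichever site is nearest to it. The sketch below shows that membership test for the shop analogy, assuming planar coordinates and Euclidean distance; the function name and coordinates are hypothetical:

```python
import math


def nearest_station(point, stations):
    """Index of the station whose Voronoi cell contains `point` (nearest site wins)."""
    if not stations:
        raise ValueError("no stations given")
    return min(range(len(stations)), key=lambda i: math.dist(point, stations[i]))


# Hypothetical base station positions on a flat plane.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
cell_a = nearest_station((1.0, 1.0), stations)
cell_b = nearest_station((9.0, 1.0), stations)
print(cell_a, cell_b)
```

Real coverage areas depend on terrain and radio conditions, so the Voronoi cell is only a geometric approximation of a base station's region.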
41
BASE STATION TRAFFIC DYNAMICS- Contd.
More on Voronoi cells:
• Voronoi cells are smaller in certain areas (city
centers), signifying some degree of cell planning.
• We can readily see again that the cells are not
uniformly loaded in space. The load
differentials can extend several orders of
magnitude.
Region1
• There does appear to be some degree of
negative correlation between the Voronoi cell
size and load.
• Large Voronoi cells mean sparsely located base
stations, implying sparser population density.
• No significant spatial correlation between
adjacent cells is observed via visual inspection
of similar plots for all days.
Region2
42
BASE STATION TRAFFIC DYNAMICS- Contd.
Temporal Characteristics: correlation or predictable relationship between signals
observed at different moments in time.
1. Load:
• Hourly aggregate load of the entire
network and highly loaded base
stations.
• Aggregate network load exhibits a nice
periodic behavior with relatively high
loads during the day and the lowest
load during midnight.
• Individual base station loads do not
show that much periodicity.
• The load curve varies significantly among
individual base stations with their
peaks occurring at different times of
the day.
43
BASE STATION TRAFFIC DYNAMICS- Contd.
Auto-correlation:
• Rigorous analysis of the periodic behavior describing the network
load is done using temporal correlation for a load metric.
• Helps in understanding the underlying trends and seasonal variations better.
• Auto-correlation function of the time
series at different lags.
• Notice the plot shows a high degree of
temporal correlation.
• High peaks occur at 24 hour intervals
and low peaks at 12 hour intervals.
• This is consistent with diurnal human
activity patterns.
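The lag-based analysis can be illustrated with a sample autocorrelation function. This is a minimal sketch on a synthetic hourly series with a built-in 24-hour period; the series, names and normalization choice (dividing by the lag-0 sum of squares) are assumptions, not the paper's data or exact estimator:

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag, normalized by the lag-0 value."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be in [0, len(series))")
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    if var == 0:
        raise ValueError("constant series has no defined autocorrelation")
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Synthetic hourly load over 7 days with a pure 24-hour cycle (made-up data).
load = [100 + 50 * math.sin(2 * math.pi * h / 24) for h in range(24 * 7)]
acf_24 = autocorrelation(load, 24)
acf_12 = autocorrelation(load, 12)
print(f"lag 24: {acf_24:.2f}, lag 12: {acf_12:.2f}")
```

With this pure sinusoid the 24-hour lag is strongly positive and the 12-hour lag is negative; the smaller 12-hour peaks reported on the slide would need a series with a half-day component as well.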
44
BASE STATION TRAFFIC DYNAMICS- Contd.
Spatiotemporal Characteristics:
• Use of Moran's I to investigate spatial behavior.
• Moran's I is a measure of spatial autocorrelation.
• Spatial autocorrelation is characterized by a correlation in a signal among
nearby locations in space. It is more complex than one-dimensional
autocorrelation because spatial correlation is multi-dimensional
(i.e. 2 or 3 dimensions of space) and multi-directional.
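• It is defined as
$$I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}$$
$x$ is the hourly load on a base station (a random variable).
$\bar{x}$ is the mean of $x$.
The $x_i$ are the observations. $w_{ij}$ is the weight associated with each pair
$(x_i, x_j)$.
$N$ is the number of observations.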
45
BASE STATION TRAFFIC DYNAMICS- Contd.
More on Moran's I:
• Binary weights: 𝑤𝑖𝑗 = 1, when the base stations are in close
proximity (a threshold of 2 miles is used), else 𝑤𝑖𝑗 = 0.
• Moran’s I metric is plotted for hourly loads of all base stations in
the network on a temporal scale.
• The periodic behavior with a diurnal cycle is
interesting.
• It appears that, while temporal usage patterns
of base stations may be very different
and might even lack periodicity, there is a
general tendency for proximate base
station loads to be more correlated when
the loads are high.
• Correlation is fairly small, rarely exceeding
0.15.
• The minimum is close to zero, showing almost
independent loading behavior around
midnight, when the loads are generally
small.
46
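Moran's I with the binary proximity weights described above can be sketched as follows. This is an illustration under stated assumptions: planar coordinates in miles, Euclidean distance, and made-up loads. The function name and sample data are hypothetical, and the 2-mile threshold is the one quoted on the slide:

```python
import math


def morans_i(values, coords, threshold):
    """Moran's I with binary weights: w_ij = 1 if stations i != j lie within `threshold`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    weight_sum = 0
    numer = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                weight_sum += 1
                numer += dev[i] * dev[j]
    if weight_sum == 0 or denom == 0:
        raise ValueError("no neighboring pairs or no variation in the values")
    return (n / weight_sum) * (numer / denom)


# Hypothetical layout: two pairs of nearby base stations, far apart from each other.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
clustered = morans_i([10, 10, 0, 0], coords, 2.0)    # neighbors carry similar loads
alternating = morans_i([10, 0, 10, 0], coords, 2.0)  # neighbors carry opposite loads
print(clustered, alternating)
```

Evaluating the function once per hour over all base station loads would give a time series of the kind plotted on the slide.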
Implication of variability in Base station Load
• High degree of variability in base station loads has important
implication on spectrum allocation and energy saving schemes in
the network.
• Schemes that adaptively turn on/off certain carriers or radios in base
stations based on the load experienced need to be developed.
• Peak hours of different cells vary a lot
• Dynamic allocation of spectrum resources to highly loaded cells
during their peak hours
• Future Work: model the demand characteristics on different cells in
cellular data networks based on measurements for a long period of
time and feed the model as inputs to dynamic spectrum allocation
algorithms. Study the observation
47
Paper 2 – NetPiculet – Untold Story of middleboxes
• Cellular networks are becoming more and more ubiquitous and
pervasive.
• Two major players are involved in such networks:
- Network providers
- Application developers
• Cellular Networks also face problems similar to their Internet
counterparts such as IP address space depletion and security
loopholes
• Moreover, cellular networks have limited resources.
• To make the best use of their limited resources, providers deploy a
number of middleboxes to enforce policies.
48
NetPiculet
• An Android application released to the market place in January 2011
in order to record policies
• Major policies tested are NAT and firewall
• Tested over 6 continents and 107 different carriers.
• Made attractive to users by informing them of their network's
shortcomings and loopholes
49
NetPiculet - System Architecture
50
NAT traversal
• NAT traversal is a general term for
techniques that establish and
maintain Internet protocol
connections traversing a NAT
gateway
• IPv4 address space depleted and
number of users increasing.
• NAT also allows hiding of end clients
behind NAT routers and thus
increases security
• Many filtering policies are implemented
at NAT gateways; NetPiculet aims to
discover them.
51
NAT mapping schemes
• A NAT middlebox maps an external endpoint based on the TCP
5-tuple (protocol, local-addr, local-process, foreign-addr,
foreign-process)
• Mapping can be any one of the following:
- Independent: external endpoint remains the same
- Address and Port(delta): external endpoint changes when
destination endpoint changes
- Connection(delta): external endpoint changes for each new
connection
[delta: increment in external port number for every new connection]
• The port number is predicted in order to test with a stream of
packets for new connections.
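The three mapping types listed above can be told apart from the external ports observed for a series of new connections. The sketch below is an illustration only and is not NetPiculet's actual test logic; the function name, the observation format (destination label, external port) and the sample data are all hypothetical:

```python
from collections import defaultdict


def classify_nat_mapping(observations):
    """Classify a NAT mapping from (destination, external_port) pairs, one per new connection."""
    ports_by_dest = defaultdict(list)
    for dest, port in observations:
        ports_by_dest[dest].append(port)
    # Two distinct destinations and one repeated destination are needed to
    # separate the three cases.
    if len(ports_by_dest) < 2 or all(len(p) < 2 for p in ports_by_dest.values()):
        raise ValueError("need at least two destinations and one repeated destination")
    if len({port for _, port in observations}) == 1:
        return "independent"       # same external endpoint everywhere
    if all(len(set(p)) == 1 for p in ports_by_dest.values()):
        return "address-and-port"  # endpoint changes only with the destination
    return "connection"            # endpoint changes for every new connection


# Hypothetical probes to two servers "a" and "b".
independent = classify_nat_mapping([("a", 5000), ("b", 5000), ("a", 5000)])
per_dest = classify_nat_mapping([("a", 5000), ("b", 5001), ("a", 5000)])
per_conn = classify_nat_mapping([("a", 5000), ("b", 5001), ("a", 5002)])
print(independent, per_dest, per_conn)
```

In the last example the ports grow by 1 per connection, which corresponds to a delta of 1 in the notation above.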
52
NAT Policies
• NAT properties:
- Endpoint filtering
- TCP state tracking
- Filtering response
- Packet mangling
• NAT characteristics:
- Time-dependent NAT mapping: has advantages as well as
disadvantages, so a compromise value has to be
chosen depending on the tradeoff
- Multiple NAT boxes: system complexity increases
53
Summary of results of NAT Policies
• Discovered a previously unknown NAT mapping scheme and
implemented a corresponding traversal scheme which succeeds
with high probability.
• A single client may encounter multiple NAT boxes due to load
balancing and hence care should be taken to maintain mapping
during the traversal.
• Some of the carriers assign random ports for connections, which is
the worst case for NAT mapping and traversal. The birthday paradox
is used to resolve the mapping, but for P2P applications it is better
to use a consistent mapping scheme.
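The slide does not give the details of the birthday-paradox approach, so the following is only a generic estimate of why it can work against random port assignment: if a client holds several mappings open and the peer probes several distinct ports, the chance of at least one hit grows quickly. The function name, parameters and numbers are assumptions for illustration:

```python
def hit_probability(port_space, open_ports, probes):
    """Chance that at least one of `probes` distinct guesses hits one of `open_ports` ports."""
    if not 0 <= open_ports <= port_space or not 0 <= probes <= port_space:
        raise ValueError("open_ports and probes must lie within the port space")
    if probes > port_space - open_ports:
        return 1.0  # more probes than closed ports: a hit is guaranteed
    miss = 1.0
    for i in range(probes):
        # Probability that the (i+1)-th probe also lands on a closed port.
        miss *= (port_space - open_ports - i) / (port_space - i)
    return 1.0 - miss


# Hypothetical scenario: 256 open mappings, 256 probes, 65,536 possible ports.
p = hit_probability(65536, 256, 256)
print(f"probability of at least one hit: {p:.2f}")
```

A single guess against a single open port would succeed with probability 1/65536, which is why a predictable (consistent) mapping is much cheaper to traverse.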
54
Firewall
• Required to protect end users from malicious attacks such as DoS,
battery drain-out, etc.
• Implemented at middleboxes inline with NAT.
• Methodology used for testing:
- Testing IP spoofing
- Testing stateful firewall
- Testing TCP connection timeout
- Testing out-of-order packet buffering
55
Firewall Policies
56
Firewall Policies – Implications and Recommendations
• Energy impact of TCP connection timeout
• Performance and energy impact of buffering:
- Disabling TCP fast retransmit
- Bad interaction with Protection Against Wrapped Sequence numbers (PAWS)
- Bad interaction with TCP Forward RTO-Recovery (F-RTO)
• Exploiting large sequence number window
• Flaws with closing TCP connections
57
Firewall Policies – Effect on Download time
58
Firewall Policies – Effect on Energy Consumption
59
Summary of Firewall Policies
• 4 out of 60 cellular networks allow IP spoofing, making the user
vulnerable.
• Nearly 15% of carriers set the TCP timeout to less than 10 minutes,
increasing energy consumption. An SDK is suggested for
developers to maintain uniformity.
• TCP out-of-order buffering causes degraded performance and energy
waste in some cases, so a tradeoff has to be made between
performance and security.
60