Trace-driven Context-aware Mobile Networks Towards Mobile Social Networks Ahmed Helmy Computer and Information Science and Engineering (CISE) Dept Mobile Networking Lab (NOMADS group) University of Florida helmy@ufl.edu http://nile.cise.ufl.edu/MobiLib Birds-Eye View: Mobile & Networking Lab Architecture & Protocol Design Protocol Independent Multicast (PIM) Robust Geographic Wireless Services (Geo-Routing, Geocast, Rendezvous) Query Resolution in Wireless Networks (ACQUIRE & Contacts) Gradient Routing (RUGGED) Multicast-based Mobility (M&M) Worms, Traceback in Mobile Networks Mobility-Assisted Protocols (MAID) Socially-Aware Networks Methodology & Tools Network Simulator (NS-2) Test Synthesis (STRESS) Protocol Block Analysis (BRICS) Mobility Modeling (IMPORTANT, TVC) Behavioral Analysis in Wireless Networks (IMPACT & MobiLib) Introduction & Problem Scope • Future network devices are mobile & ‘personal’ – Very tight coupling between devices & humans • Network performance significantly affected by (and affects) users behavior: – Movement, grouping, on-line activity, trust, cooperation… • How do users behave in mobile societies? • What kinds of protocols/networks survive & perform well in highly mobile societies? Paradigm Shift in Protocol Design Used to: Design general purpose protocols Evaluate using models (random mobility, traffic, …) Modify to improve performance and failures for specific context – May end up with suboptimal performance or failures due to lack of context in the design Propose to: Analyze, model deployment context Design ‘application class’-specific parameterized protocols Utilize insights from context analysis to fine-tune protocol parameters Problem Statement • How to gain insight into deployment context? • How to utilize insight to design future services? Approach • Extensive trace-based analysis to identify dominant trends & characteristics • Analyze user behavioral patterns – Individual user behavior and mobility – Collective user behavior: grouping, encounters • Integrate findings in modeling and protocol design – I. User mobility modeling – II. Behavioral grouping – III. Information dissemination in mobile societies, profile-cast The TRACE framework MobiLib Trace Characterize (Cluster) x1,1 x1,n xt ,1 xt ,n Represent Analyze Employ (Modeling & Protocol Design) Vision: Community-wide Wireless/Mobility Library • Library of – Measurements from Universities, vehicular networks – Realistic models of behavior (mobility, traffic, friendship, encounters) – Benchmarks for simulation and evaluation – Tools for trace data mining • Use insights to design future context-aware protocols? • http://nile.cise.ufl.edu/MobiLib Trace Libraries of Wireless Traces • Multi-campus (community-wide) traces: – MobiLib (USC (04-06), now @ UFL) • nile.cise.ufl.edu/MobiLib • 15+ Traces from: USC, Dartmouth, MIT, UCSD, UCSB, UNC, UMass, GATech, Cambridge, UFL, … • Tools for mobility modeling (IMPORTANT, TVC), data mining – CRAWDAD (Dartmouth) • Types of traces: – – – – – University Campus (mainly WLANs) Conference AP and encounter traces Municipal (off-campus) wireless Bus & vehicular wireless networks Others … (on going) Trace Wireless Networks and Mobility Measurements • In our case studies we use WLAN traces – From University campuses & corporate networks (4 universities, 1 corporate network) – The largest data sets about wireless network users available to date (# users / lengths) – No bias: not “special-purpose”, data from all users in the network • We also analyze – Vehicular movement trace (Cab-spotting) – Human encounter trace (at Infocom Conf) Trace Case study I – Individual mobility T ra c e s O b se rv a tio n A p p lic a tio n In d iv id u a l u se r m o b ility M o b ility m odel M ic ro sc o p ic b e h a v io r U se r g ro u p s in th e p o p u la tio n E n c o u n te r p a tte rn s in th e n e tw o rk P ro file -c a st p ro to c o l S m a llW o rld b a se d m e ssa g e d isse m in a tio n M a c ro sc o p ic b e h a v io r Case Study I: Goal • To understand the mobility/usage pattern of individual wireless network users • To observe how environments/user type/tracecollection techniques impact the observations • To propose a realistic mobility model based on empirical observations – That is mathematically tractable – That is capable of characterizing multiple classes of mobility scenarios IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis* - 4 major campuses – 30 day traces studied from 2+ years of traces - Total users > 12,000 users - Total Access Points > 1,300 Trace source Trace duration User type Environment Collection method Analyzed part MIT 7/20/02 – 8/17/02 Generic 3 corporate buildings Polling Whole trace Dartmouth 4/01/01 – 6/30/04 Generic w/ subgroup University campus Event-based July ’03 April ’04 UCSD 9/22/02 – 12/8/02 PDA only University campus Polling USC 4/20/05 – 3/31/06 Generic University campus Event-based 04/20/0505/19/05 (Bldg) 09/22/0210/21/02 • Understand changes of user association behavior w.r.t. – Time - Environment - Device type - Trace collection method * W. Hsu, A. Helmy, “IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis”, two papers at IEEE Wireless Networks Measurements (WiNMee), April 2006 Metrics for Individual Mobility Analysis • What kind of spatial preference do users exhibit? – The percentile of time spent at the most frequently visited locations • What kind of temporal repetition do users exhibit? – The probability of re-appearance • How often are the nodes present? – Percentage of “online” time x1,1 x1,n xt ,1 xt ,n Represent Fraction of online time associated with the AP Prob.(coverage > x) Observations: Visited Access Points (APs) CCDF of coverage of users [percentage of visited APs] Average fraction of time a MN associates with APs •Individual users access only a very small portion of APs in the network. •On average a user spends more than 95% of time at its top 5 most visited APs. •Long-term mobility is highly skewed in terms of time associated with each AP. •Users exhibit “on”/”off” behavior that needs to be modeled. Repetitive Behavior •Clear repetitive patterns of association in wireless network users. •Typically, user association patterns show the strongest repetitive pattern at time gap of one day/one week. Mobility Characteristics from WLANs Prob.(online time fraction > x) • Simple existing models are very different from the characteristics in WLAN On/off activity pattern Skewed location preference Periodic re-appearance Characterize Mobility Models • Mobility models are of crucial importance for the evaluation of wireless mobile networks [IMP03] • Requirements for mobility models – Realism (detailed behavior from traces) – Parameterized, tunable behavior – Mathematical tractability • Related work on mobility modeling – Random models (Random walk/waypoint): inadequate for human mobility – Improved synthetic models (pathway model, RPGM, WWP, FWY, MH) – more realistic, difficult to analyze – Trace-based model (T/T++): trace-specific, not general Time-variant Community (TVC) Model (W. Hsu, Thyro, K. Psounis, A. Helmy, “Modeling Time-variant User Mobility in Wireless Mobile Networks”, IEEE INFOCOM, 2007, Trans. on Networking) • Skewed location visiting preference – Create “communities” to be the preferred area of movement – Each node can have its own community • Node moves with two different epoch types – Local or roaming – Each epoch is a random-direction, straight-line movement – Local epochs in the community – Roaming epochs around the whole simulation area L 75% 25% Employ Tiered Time-variant Community (TVC) Model • Periodical re-appearance – Create structure in time – Periods – Node moves with different parameters in periods to capture time-dependent mobility – Repetitive structure TP1 TP2 TP3 TP1 TP2 TP3 Repetitive time period structure • Finer granularity in space & time Time period 1 (TP1) – Multi-tier communities – Multiple time periods C o m m 11 Time period 2 (TP2) Time Time period 3 (TP3) C o m m 13 C o m m 21 C o m m 22 C o m m 1 C o m m 23 2 C o m m 33 C o m m 31 C o m m 43 Employ Using the TVC Model – Reproducing Mobility Characteristics • (STEP1) Identify the popular locations; assign communities • (STEP2) Assign parameters to the communities according to stats • (STEP3) Add user on-off patterns (e.g., in WLAN, users are usually off when moving) Using the TVC Model – Reproducing Mobility Characteristics • WLAN trace (example: MIT trace) A P sorted by total am ou n t of tim e associated w ith it 11 21 31 41 51 61 71 81 91 1 .E + 0 0 1 .E -0 1 1 .E -0 2 M odel-sim plified 1 .E -0 3 1 .E -0 4 1 .E -0 5 M IT M odel-com plex 1 .E -0 6 0 .3 Prob.(Node re-appear at the same AP after the time gap) Average fraction of online time associated with the AP 1 0 .2 5 0 .2 M odel-sim plified M IT 0 .1 5 0 .1 0 .0 5 M odel-com plex 0 0 Skewed location visiting preference 2 4 T im e g ap (d ay s) 6 Periodic re-appearance * Model-simplified: single community per node. Model-complex: multiple communities ** Similar matches achieved for USC and Dartmouth traces 8 Using the TVC Model – Reproducing Mobility Characteristics • Vehicular trace (Cab-spotting) L oca tion sorte d b y tota l a m ou n t of tim e a ssocia te d w ith it 1 11 21 31 41 51 61 71 81 91 1 .E -0 1 1 .E -0 2 1 .E -0 3 1 .E -0 4 1 .E -0 5 1 .E -0 6 M od e l V e h icle -tra ce 0 .3 Prob.(Node re-appear at the same location after the time gap) Average fraction of online time associated with the location 1 .E + 0 0 0 .2 5 M odel 0 .2 0 .1 5 V eh icle-trace 0 .1 0 .0 5 0 0 2 4 T im e g ap (d ay s) 6 8 Using the TVC Model – Reproducing Mobility Characteristics • Human encounter trace at a conference 10 100 In te r-m e e tin g tim e (s) 1000 10000 100000 M e e tin g d u ra tio n (s) 1 1000000 0 .1 10 100 1000 10000 1 M odel 0 .0 1 C a m b rid g e -IN F O C O M -tra c e 0 .0 0 1 0 .0 0 0 1 Prob(Meeting duration > X) Prob(Inter-meeting time > X) 1 0 .1 M odel 0 .0 1 C a m b rid g e -IN F O C O M -tra c e 0 .0 0 1 1 6 h o u rs 0 .0 0 0 1 0 .0 0 0 0 1 Inter-meeting time A encounters B Encounter Inter-meeting duration time Encounter duration time 100000 Case study II – Groups in WLAN T ra c e s O b s e rv a tio n A p p lic a tio n In d iv id u a l u s e r m o b ility M o b ility m odel M ic ro s c o p ic b e h a v io r U s e r g ro u p s in th e p o p u la tio n E n c o u n te r p a tte rn s in th e n e tw o rk P ro file -c a s t p ro to c o l S m a llW o rld based m essage d is s e m in a tio n M a c ro s c o p ic b e h a v io r Case Study II: Goal • Identify similar users (in terms of long run mobility preferences) from the diverse WLAN user population – Understand the constituents of the population – Identify potential groups for group-aware service • In this case study we classify users based on their mobility trends (or location-visiting preferences) – We consider semester-long USC trace (spring 2006, 94days) and quarter-long Dartmouth trace (spring 2004, 61 days) Representation of User Association Patterns • We choose to represent summary of user association in each day by a single vector – a = {aj : fraction of online time user i spends at APj on day d} -Office, 10AM -12PM -Library, 3PM – 4PM -Class, 6PM – 8PM Association vector: (library, office, class) =(0.2, 0.4, 0.4) • Summarize the long-run mobility in an E ach row represents an “association matrix” association vector for a tim e slot x 1 ,1 x 1 , 2 x A n entry represents the 2 ,1 percentage of online tim e during tim e slot i at location j x t ,1 xi, j x1 , n x t , n E ach colum n represents the popularity for a location across tim e x1,1 x1,n xt ,1 xt ,n Represent Eigen-behavior • Eigen-behaviors: The vectors that describe the maximum remaining power in the association matrix (obtained through Singular Value Decompostion) k 1 u1 arg max X u , uk arg max ( X Xui ui' ) u k 2 u 1 u 1 i 1 wk i 1 i2 with quantifiable importance • Eigen-behavior Distance calculates similarity of users by weighted inner products of eigen-behaviors. – Sim(U ,V ) wi w j ui v j i , j • Assoc. patterns can be re-constructed with low rank & error • Benefits: Reduced computation and noise 2 k Rank( X ) Similarity-based User Classification • With the distance between users U and V defined as 1-Sim(U,V), we use hierarchical clustering to find similar user groups. In te r-g ro u p In tra -g ro u p S e rie s3 S e rie s4 1 USC 1 0 .6 0 .4 A M V D E ig e n -b e h a v io r d ista n c e 0 .4 AMVD 0 .2 Dartmouth 0 .8 E ig e n -b e h a v io r d ista n c e CDF CDF 0 .8 0 .6 In te r-g ro u p In tra-g ro u p S e ries3 S e ries4 0 .2 0 0 0 0 .2 0 .4 0 .6 0 .8 D ista n c e b e tw e e n u se rs 1 0 *AMVD = Average Minimum Vector Distance 0 .2 0 .4 0 .6 0 .8 D ista n c e b e tw e e n u se rs 1 Validation of User Groups • Significance of the groups – users in the same group are indeed much more similar to each other than randomly formed groups (0.93 v.s. 0.46 for USC, 0.91 v.s. 0.42 for Dartmouth) • Uniqueness of the groups – the most important group eigen-behavior is important for its own group but not other groups Significance score of top eigen-behavior for USC Dartmouth Its own group 0.779 0.727 Other groups 0.005 0.004 User Groups in WLAN - Observations Group size • Identified hundreds of distinct groups of similar users • Skewed group size distribution – the largest 10 groups account for more than 30% of population on campus. Power-law distributed group sizes. • Most groups can be described by a list of locations with a clear ordering of importance • We also observe groups visiting multiple locations with similar importance – taking the most important location for each user is not sufficient 1000 D a rtm o u th 5 4 0 * x ^ -0 .6 7 USC 5 0 0 * x ^ -0 .7 5 100 10 1 1 10 100 U se r g ro u p siz e ra n k 1000 Case study III – Encounter Patterns T ra c e s O b s e rv a tio n A p p lic a tio n In d iv id u a l u s e r m o b ility M o b ility m odel M ic ro s c o p ic b e h a v io r U s e r g ro u p s in th e p o p u la tio n E n c o u n te r p a tte rn s in th e n e tw o rk P ro file -c a s t p ro to c o l S m a llW o rld based m essage d is s e m in a tio n M a c ro s c o p ic b e h a v io r Case Study III: Goal • Understand inter-node encounter patterns from a global perspective – How do we represent encounter patterns? – How do the encounter patterns influence network connectivity and communication protocols? • Encounter definition: – In WLAN: When two mobile nodes access the same AP at the same time they have an ‘encounter’ – In DTN: When two mobile nodes move within communication range they have an ‘encounter’ 0 Fraction of user population (x) 0.4 0.6 0.2 0.8 1 1 Cambridge 0.1 UCSD MIT USC 0.01 Dart-04 0.001 0.0001 Prob. (total encounter events > x) Prob. (unique encounter fraction > x) Observations: Encounters Dart-03 CCDF of unique encounter count CCDF of total encounter count •In all the traces, the MNs encounter a small fraction of the user population. • A user encounters 1.8%-6% on average of the user population (except UCSD) •The number of total encounters for the users follows a BiPareto distribution. Encounter-Relationship (ER) graph • Draw a link to connect a pair of nodes if they ever encounter with each other … Analyze the graph properties? Group of good friends… x1,1 x1,n xt ,1 xt ,n Cliques with random links to join them Represent Small Worlds of Encounters Regular graph Normalized CC and PL • Encounter graph: nodes as vertices and edges link all vertices that encounter Clustering Coefficient (CC) Small World Av. Path Length Random graph • The encounter graph is a Small World graph (high CC, low PL) • Even for short time period (1 day) its metrics (CC, PL) almost saturate Background: Delay Tolerant Networks (DTN) • DTNs are mobile networks with sparse, intermittent nodal connectivity • Encounter events provide the communication opportunities among nodes • Messages are stored and moved across the network with nodal mobility A B C Information Diffusion in DTNs via Encounters • Epidemic routing (spatio-temporal broadcast) achieves almost complete delivery Trace duration = 15 days Unreachable ratio (Fig: USC) Robust to the removal of short encounters Robust to selfish nodes (up to ~40%) Encounter-graphs using Friends • Distribution for friendship index FI is exponential for all the traces • Friendship between MNs is highly asymmetric • Among all node pairs: < 5% with FI > 0.01, and <1% with FI > 0.4 •Top-ranked friends form cliques and low-ranked friends are key to provide random links (short cuts) to reduce the degree of separation in encounter graph. Profile-cast W. Hsu, D. Dutta, A. Helmy, Mobicom 2007 • Sending messages to others with similar behavior, without knowing their identity – Announcements to users with specific behavior V – Interest-based ads, similarity resource discovery • Assuming DTN-like environment B Is E similar to V? E Is B similar to V? C ? D Is C/D similar to V? A Profile-cast Use Cases • Mobility-based profile-cast – Targeting group of users who move in a particular pattern (lost-and-found, context-aware messages, moviegoers) – Approach: use “similarity metric” between users • Mobility-independent profile-cast – Targeting people with a certain characteristics independent of mobility (classic music lovers) – Approach: use “Small World” encounter patterns Mobility-based Profile-cast Mobility space N SN Forward?? S D D Scoped message spread in the mobility space N N D Profile-cast Operation 1. profiling N N S N – Singular value decomposition • Profiling user mobility provides summary ofnode the – Theamobility of a matrixis(Arepresented few eigen-behavior by an vectors are sufficient, e.g. for association matrix 99% of users at most 7 vectors describe 90% of power in the x x association matrix) x 1 ,1 N Each row represents an association vector for time slot a entry represents x 2 ,1 x t ,1 1, 2 1, n x ,j Sum. i vectors x t , n An the percentage of online time during time slot i at location j Profile-cast Operation 1. profiling N N S N • Determining user similarity – S sends Eigen behaviors for the virtual profile to N – N evaluated the similarity by weighted inner products of Eigen-behaviors Sim(U ,V ) wi w j ui v j 2. Forwarding decision N i , j – Message forwarded if Sim(U,V) is high (the goal is to deliver messages to nodes with similar profile) – Privacy conserving: N and S do not send information about their own behavior Profile-cast Evaluation • Epidemic: Near perfect delivery ratio, low delay, high overhead • Centralized: Near perfect delivery ratio, low overhead, a bit extra delay • Decentral: provides tradeoff between delivery & overhead • Random: poor delivery ratio Epidemic Decentral Decentral Decentral Random Random Random Random - Decentralized I-cast achieves: > 50% reduction in overhead of Epidemic >30% increase in delivery of Random * Results presented as the ratio to epidemic routing Evaluation - Result Success Rate Delay Overhead Flooding Centralized Similarity 0.7 Similarity 0.6 Similarity 0.5 RTx m=1 TTL=inf. RTx m=3 TTL=inf. RTx m=6 TTL=inf. RTx m=9 TTL=inf. RTx m=inf. TTL=1 RTx m=inf. TTL=5 0 0.2 0.4 • Centralized: Excellent success 92% rate with only 3% overhead. 3.13 • Similarity-based: 45% (1) 61% success rate at low 2.56 overhead, 92% success rate more overhead 2.06 at 45% overhead (2) A flexible success rate – 1.80 overhead tradeoff 2.11 • RTx with infinite TTL: Much 1.47 more overhead undersimilar success rate • Short RTx with many copies: 0.6 0.8 1 1.2 1.4 Good success rate/overhead, but delay is still long Profile-cast Initial Results • Adjustable overhead/delivery rate tradeoff – 61% delivery rate of flooding with 3% overhead – 92% delivery rate with 45% overhead • Better than single random walk in terms of delay, delivery rate • Multiple short random walks also work well in this case Future Work • Sending to a mobility profile specified by the sender – Gradient ascend followed by similarity comparison (in the mobility space) • Mobility independent profile-cast – The encounter pattern provides a network in which most nodes are reachable – We don’t want to flood – How to leverage the Small World encounter pattern to reach the “neighborhood” of most nodes efficiently? Future Work – One-copy-per-clique in the “mobility space” - D iffe re n t le g e n d s re p re se n t n o d e s w ith d iffe re n t m o b ility tre n d s -W h ite n o d e s d e n o te th e ta rg e t re c ip ie n ts S S S In te re st sp a c e M o b ility sp a c e P h y sic a l sp a c e – We expect this to work because similarity in mobility leads to frequent encounters 0 .7 0 .6 Encounter Ratio 0 .5 0 .4 0 .3 0 .2 0 .1 0 0 0 .2 0 .4 0 .6 U se r p a ir sim ila rity 0 .8 1 Future Directions (Applications) • Detect abnormal user behavior & access patterns based on previous profiles • Behavior aware push/caching services (targeted ads, events of interest, announcements) • Caching based on behavioral prediction • Can/should we extend this paradigm to include social aspects (trust, friendship, …)? • Privacy issues and mobile k-anonymity On Mobility & Predictability of VoIP & WLAN Users J. Kim, Y. Du, M. Chen, A. Helmy, Crawdad 2007 Work in-progress Markov O(2) Predictor Accuracy VoIP User Prediction Accuracy -VoIP users are highly mobile and exhibit dramatic difference in behavior than WLAN users -Prediction accuracy drops from ave ~62% for WLAN users to below 25% for VoIP users Motivates -Revisiting mobility modeling -Revisiting mobility prediction Gender-based feature analysis in Campus-wide WLANs U. Kumar, N. Yadav, A. Helmy, Mobicom 2007, Crawdad 2007 3500 Percentage Users 35 3000 M… 20 2000 15 1500 visitors 1000 10 500 0 25 University Campus traces traces Area 5 0 Intel Apple Gem… Ente… Links… ASKE… D-Link Ager… Netg… Average Duration (sec) 2500 Ma le 30 Manufacturer - Able to classify users by gender using knowledge of campus map -Users exhibit distinct on-line behavior, preference of device and mobility based on gender -On-going Work -How much more can we know? -What is the “information-privacy trade-off”? The Next Generation (Boundless) Classroom Students sensor sensor sensor sensor sensor sensor-adhoc Embedded sensor network WLAN/adhoc WLAN/adhoc sensor sensor sensor Multi-party conference Tele-collaboration tools sensor sensor sensor-adhoc Instructor WLAN/adhoc Challenges sensor sensor sensor sensor sensor-adhoc -Integration of wired Internet, WLANs, Adhoc Mobile and Sensor Networks -Will this paradigm provide better learning experience for the students? Real world group experiments (structural health monitoring) Future Directions: TechnologyHuman Interaction The Next Generation Classroom Emerging Wireless & Multimedia Technologies Protocols, Applications, Services Human Behavior Mobility, Load Dynamics Engineering Multi-Disciplinary Research Human Computer Interaction (HCI) & User Interface Social Sciences Cognitive Sciences Education Psycology Application Development Service Provisioning Emerging Wireless & Multimedia Technologies How to Capture? Protocols, Applications, Services Human Behavior Educational/ Learning Experience Protocol Design How to Evaluate? Measurements Mobility Models Context-aware Networking How to Design? Traffic Models Mobility, Load Dynamics Disaster Relief (Self-Configuring) Networks sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor sensor On-going and Future Directions Utilizing mobility – Controlled mobility scenarios • DakNet, Message Ferries, Info Station – Mobility-Assisted protocols • Mobility-assisted information diffusion: EASE, FRESH, DTN, $100 laptop – Context-aware Networking • Mobility-aware protocols: self-configuring, mobility-adaptive protocols • Socially-aware protocols: security, trust, friendship, associations, small worlds – On-going Projects • Next Generation (Boundless) Classroom • Disaster Relief Self-configuring Survivable Networks Thank you! Ahmed Helmy helmy@ufl.edu URL: www.cise.ufl.edu/~helmy MobiLib: nile.cise.ufl.edu/MobiLib Emerging Wireless Communication • Opportunities • Challenges – Dynamic network structure – Decentralized service paradigm – Tight coupling between the devices and individuals Outline T ra c e s Detailed behavior analysis Complete case O b s e rv a tio n In d iv id u a l u s e r m o b ility U s e r g ro u p s in th e p o p u la tio n E n c o u n te r p a tte rn s in th e n e tw o rk Future Work A p p lic a tio n M o b ility m odel M ic ro s c o p ic b e h a v io r P ro file -c a s t p ro to c o l S m a llW o rld based m essage d is s e m in a tio n M a c ro s c o p ic b e h a v io r Trace Sets • Available information from WLAN traces – MAC addresses of the devices as identifiers – Location/Time of users (our main focus) Node: e0_12_29_fc_ba_8c Association Start time 2197745 2230200 2257917 2285119 2297134 2304287 Location_ID 172.16.8.244_11009 172.16.8.244_11009 172.16.8.244_11009 172.16.8.244_11009 172.16.8.244_11009 172.16.8.244_11023 Duration 4433 13320 643 1017 7153 6744 Trace Summary (Case Study I) • We observe some omni-present mobility characteristics from WLANs. • These characteristics are not captured by existing synthetic mobility models (i.e., hence the models are not realistic) • We propose the Time-variant Community (TVC) model, which is realistic, theoretically tractable, and flexible Theoretical Tractability • For the TVC model, we can derive – Nodal spatial distribution – the demographic profile of the mobility model – Average node degree – important for cluster maintenance and geographic routing – Hitting time/ Meeting time – important for routing performance analysis • With low error when the communication range is small compared to the community sizes (communication disk < 25% of community) Theoretical Tractability 3 M o d el6 -s im M o d el6 -th eo ry M o d el7 -s im M o d el7 -th eo ry M o d el3 -s im M o d el3 -th eo ry 2 .5 0.08 Average Node Degree Prob(node appears at (X,Y)) 0.1 0.06 0.04 0.02 800 600 400 Y 200 0 0 200 400 600 800 2 1 .5 1 0 .5 0 0 0 10 X 50000 30 40 50 15000 45000 M o d el1 -s im M o d el1 -th eo ry M o d el2 -s im M o d el2 -th eo ry M o d el6 (m u lti_ tier)-s im M o d el6 (m u lti_ tier)-th eo ry 35000 30000 25000 M o d el5 (tiered _ co m m )-s im M o d el5 (tiered _ co m m )-th eo ry M o d el7 (m u lti_ co m m )-s im M o d el7 (m u lti_ co m m )-th eo ry M o d el3 -s im M o d el3 -th eo ry 12000 Meeting Time (s) 40000 Hitting time (s) 20 C o m m u n ic a tio n R a n g e (K ) 20000 15000 10000 9000 6000 3000 5000 0 0 10 20 30 40 50 C o m m u n ic a tio n R a n g e (K ) 60 70 10 20 30 40 50 C o m m u n ic a tio n R a n g e (K ) 60 70 Theory Derivation – Hitting Time • Hitting time – the time for a node to move into the communication range of a randomly chosen target coordinate, starting from the stationary distribution (hit) Theory Derivation – Hitting Time 1. Weighted average conditioned on the relative location of the ‘target’ HT HT (target in comm ) Pr( target in comm ) comm i i 2. Calculate the unit-time hitting probability for each scenario Pht Pmove 2(Comm_range )(Ave_spee d) (Community _edge) 2 3. Calculate hitting probability for the whole time period 4. Calculate the conditional hitting time Application II: Trace-based Mobility Modeling • Skewed location preference • Repetitive behavior – Nodes spend 95% of time at top 5 preferred locations. – Heavily visited “preferred spots” – Nodes show up repeatedly at the same location after integer multiples of days. – Periodical “daily/weekly schedules” Similarity-based User Classification: Association-based Representation* (AP2:library, 1:30PM-2:30PM) (AP1:office, 10AM-12PM) (AP3: class, 6PM-8PM) Association vector: (AP1, AP2, AP3) =(0.2, 0.4, 0.4) • For a given day d, user assoc. vector is defined by n-element vector – a = {aj : fraction of online time user i spends at APj on day d} – ‘n’ is the number of APs a a1 a2 an – Use zero vector for off-line users • Vector elements quantify relative attraction of AP to user • User Association Consistency – User i is ‘consistent’ if daily assoc. vectors can be grouped into few clusters n – Use clustering with Manhattan distance measure D(a, b) ai bi i 1 * W. Hsu, D. Dutta, A. Helmy, Mobicom 2007 Summarizing user associations • Association matrix: concatenate user association vectors for all days into a matrix. • To summarize, perform SVD and store the top-k eigen values/vectors. • What value of k we have to use for a good representation of the matrix? k – Captured matrix power = i i i 2 d i i 2 i , i i - th singular v alue • How much is the reconstruction error? – Matrix norms ||X-Xk||p/||X||p where p X p p X ( i , j ) (i , j ) Daily association vector Summarizing user associations Trace % users # vectors (rank) power USC (Bldg) 95% 6 90% Dartmouth (AP) 92% 6 90% Dartmouth (Bldg) 94% 4 90% * Matrix reconstruction error < 5% Assoc. patterns can be re-constructed with low rank and low error Clustering Users with Similar Behavior • Exhaustive comparison of assoc. vectors: – Find average of |ajd - aid| over all days d for all i,j pairs – Drawback: O(nd2) for each pair • Compare similarity of eigen-vectors obtained from SVD • Use weighted inner products of eigen vectors U, V – , – – wui = proportion of power of SV – D(U,V) = 1 - Sim(U,V) – Corr > 91% with exhaustive Can achieve very good clustering efficiently using distributed computation A handful of eigen-vectors can capture most of the behavior power • Derived from simultaneous associations to the same locations Prob. (unique encounter fraction > x) Encounter Events • How many other nodes does a node encounter with? 0.5 On avg. only 2%~7% of population Encounter-Relationship graph Disconnected Ratio (%) • To our surprise, disconnected pairs of nodes are low!! Summary (Case Study II) • We use SVD to obtain eigen-behaviors of individual users. • We use the eigen-behavior distances and hierarchical clustering to classify WLAN users into similar groups. • This finding is useful for mobility modeling (identifying group sizes and their frequently visited locations), network management, abnormality detection, and group-aware protocol (i.e., profile-cast, our future work) Summary (Case Study III) • The distribution of encounters in real WLAN trace is very different from synthetic models • The encounter-relationship graph displays SmallWorld characteristics • Despite a low encounter ratio of the whole population, the encounter events lead to a robust, reachable network (with long delay). Future Work – Profile-cast T ra c e s O b s e rv a tio n A p p lic a tio n In d iv id u a l u s e r m o b ility M o b ility m odel M ic ro s c o p ic b e h a v io r U s e r g ro u p s in th e p o p u la tio n E n c o u n te r p a tte rn s in th e n e tw o rk P ro file -c a s t p ro to c o l S m a llW o rld based m essage d is s e m in a tio n M a c ro s c o p ic b e h a v io r Goal • To send messages to a group of nodes within the general population – The group is defined by the intrinsic behavior patterns of the nodes (CISE students, library visitors, moviegoers) – The sender does not know the network identities (addresses) of the destinations • Different from multi-cast: No join/leave, no group maintenance Largest number of female users is in social sciences and is much higher than the male WLAN users in those buildings. Female users are surprisingly high (vs males) in the first 2 samples. WLAN activity was down Feb 07 due to lower enrollment in Spring and potential changes in the network. Females in social, economic, admin and comm/journalism generally have longer session durations than males in those majors. In Engineering, music and chemistry the opposite is true. Session durations are decreasing indicating potential increase in mobility. Apple consistently more popular in females than males Intel (PCs) are more popular in males than females Increase in use of Apple and Intel in general, and degradation in other brands Mobility Profile-cast (intra-group) Goal Flooding S Flood-sim S Single long random walk S S Multiple short random walks S Mobility Profile-cast (inter-group) Goal Flooding S T.P. S T.P. Gradient-ascend S T.P. Single long random walk S T.P. Flooding_sim Multiple short random walks S T.P. S T.P. Performance Comparison 1.2 1 Gradient ascend helps Success rate - Large groups to overcome the difficult case – when the source is far from T.P. 0.8 sim<0.0001 0.0001<=sim<0.001 0.001<=sim<0.01 0.01<=sim<0.1 0.1<=sim Few long RW is better when S is far from T.P. but many short RW is better when S is close to T.P. 0.6 0.4 0.2 0 Flooding Flooding_sim Gradient_acsend Few long RW Many short RW Performance Comparison Gradient ascend helps to overcome the difficult case – when the source is far from T.P. 3000000 Delay - Large groups Few long RW is better when S is close toT.P. but many short RW is better when S is close to T.P. sim<0.0001 0.0001<=sim<0.001 0.001<=sim<0.01 0.01<=sim<0.1 0.1<=sim 2500000 2000000 Gradient ascend 1500000 has some extra delay comparing 1000000 with flooding 500000 0 Flooding Flooding_sim Gradient_acsend Few long RW Many short RW Mobility Independent Profile-cast Goal Flooding S SmallWorld-based S Single long random walk S S Multiple short random walks S