Distributed Maintenance of Cache Freshness in Opportunistic Mobile Networks
  Wei Gao and Guohong Cao
    Dept. of Computer Science and Engineering, Pennsylvania State University
  Mudhakar Srivatsa and Arun Iyengar
    IBM T. J. Watson Research Center

Outline
  Introduction
  Refreshing Patterns of Web Contents
  Cache Refreshing Schemes
  Performance Evaluation
  Summary & Future Work

Opportunistic Mobile Networks
  Consist of hand-held personal mobile devices
    Laptops, PDAs, smartphones
  Opportunistic and intermittent network connectivity
    Result of node mobility, device power outage, or malicious attacks
    Hard to maintain end-to-end communication links
  Data transmission via opportunistic contacts
    Communication opportunity upon physical proximity

Methodology of Data Transmission
  Carry-and-Forward
    Mobile nodes physically carry data as relays
    Forwarding data opportunistically upon contacts
    Major problem: appropriate relay selection
  [Figure: nodes B, A, and C, labeled with the values 0.7 and 0.5]

Providing Data Access to Mobile Users
  Active data dissemination
    Data sources actively push data to interested users
  Publish/Subscribe
    Brokers forward data to users according to their subscriptions
  Caching
    Determining appropriate caching location/policy
    The freshness of cached data is generally ignored

Our Focus
  Maintaining the freshness of cached data
    Data may be periodically refreshed by the source
      Daily news, weather reports
    Data cached at remote locations may be out of date!
  Major challenges
    Obtaining information about cached data
      Where is the data cached?
      What is the current version of the cached data?
    Timeliness of refreshing cached data
      Uncertainty of opportunistic data transmission

Models
  Network model
    Pairwise inter-contact time: exponentially distributed
  Cache freshness model
    Version of source data in the past
    Version of data cached at node j at time t
    Probabilistic model determined by the data update model and p
    Version i of the data
    Difference between data versions i and j

Caching Scenario
  Query and response
    Requester locally stores the query, which is satisfied when the requester contacts some node caching the data
    Afterwards, the requester caches the data locally
  Data Access Tree (DAT)
    Each node only has knowledge about data cached at its children

Basic Idea
  Distributed and hierarchical refreshing
  Intentional refreshing
    A node only refreshes data cached at its children in the DAT
    Appropriate data updates are applied
  Opportunistic refreshing
    A node refreshes any cached data with old versions upon contact
    Complete data is transmitted

Datasets
  Categorized web news from multiple websites
    11 RSS feeds from CNN, New York Times, BBC, Google News, etc.
    3-week period over 7 categories of news

Distribution of Inter-Refreshing Time
  Aggregate distribution
    Mixture of exponential and power-law distributions
    Distinct boundary

Distribution of Inter-Refreshing Time
  Distributions of individual RSS feeds
    Characteristics similar to those of the aggregate distribution
    Heterogeneous boundaries

Temporal Variations
  Temporal distribution of news updates over different hours in a day
  Heterogeneity over different RSS feeds
  Significant heterogeneity

Intentional Refreshing
  Analytically ensure that the freshness requirement of cached data can be satisfied
    Calculating the utility of data updates
    Opportunistic replication of data updates

Utility of Data Updates
  B updates its child D in the DAT:
    The probability of satisfying D’s freshness requirement

Utility of Data Updates
  Exponential distribution
    The last time B contacted D
  Pareto distribution
    The minimum value of data inter-refreshing time
    Incomplete Gamma function

Opportunistic Replication of Data Updates
  Replicate data updates to non-DAT relays
  The k selected relays satisfy:
    At least one relay can deliver the data update from S to B on time

Opportunistic Refreshing
  Opportunistically update data with old versions upon contact
    Further improves the freshness of cached data
  Probabilistic decision
    Complete data needs to be transmitted
    Data is only refreshed if the required freshness cannot be satisfied by intentional refreshing
  The probability for opportunistic refreshing:
    Opportunistic refreshing
    Intentional refreshing

Side-Effect of Opportunistic Refreshing
  May hinder intentional refreshing in the future
    Inconsistency among different cached data copies
    A updates D’s cached data from d1 to d3
    B cannot update D’s cached data to d4 using u14
  Node A estimates the chance of a side-effect
    A newer version of the data has already arrived at B

Experimental Settings
  Realistic mobile network traces
  Data generation
    4 realistic RSS feeds, random nodes as data sources
  Query generation
    Randomly generated at all nodes
    Follows a Zipf distribution over the 4 RSS feeds

Performance of Maintaining Cache Freshness
  Infocom trace, hours, query time constraint T = 5 hours
  Our hierarchical refreshing scheme achieves a higher refreshing ratio, shorter refreshing delay, and lower refreshing overhead

Variation of Parameters
  Varying the parameter
    A smaller value is more difficult to satisfy and incurs higher overhead

Temporal Variations
  DieselNet trace, hours, query time constraint T = 10 hours
  Transient performance of maintaining cache freshness exhibits significant heterogeneity

Summary
  Maintaining cache freshness in opportunistic mobile networks
    Probabilistic cache freshness model
    Experimental investigation on refreshing patterns of realistic web contents
    Approach to hierarchical and distributed maintenance
  Future work
    Exploitation of temporal variations of data refreshing patterns

Thank you!
  http://mcn.cse.psu.edu
  The paper and slides are also available at: http://www.cse.psu.edu/~wxg139
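
Illustrative sketch: relay replication under exponential inter-contact times
  The slides state that pairwise inter-contact times are exponentially distributed and that the k selected relays must ensure at least one relay delivers the data update on time. The exact selection condition is not recoverable from the slides, so the Python sketch below is only one plausible reading, not the authors' method. It assumes that each relay already holds the update, that relays contact the destination independently, and that a relay "delivers on time" if its next contact with the destination falls within the deadline. The names contact_prob, select_relays, rates, deadline, and target are hypothetical.

```python
import math


def contact_prob(rate, deadline):
    """P(exponential inter-contact time <= deadline) = 1 - exp(-rate * deadline)."""
    return 1.0 - math.exp(-rate * deadline)


def select_relays(rates, deadline, target):
    """Greedily pick relays, highest contact rate first, until the probability
    that at least one of them meets the deadline reaches `target`.

    Assumption (not from the slides): relays are independent, so
    P(at least one on time) = 1 - prod(1 - p_i).
    Returns the chosen relay names and the probability actually achieved,
    which may be below `target` if all relays were used.
    """
    chosen = []
    miss = 1.0  # probability that every chosen relay misses the deadline
    ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
    for name, rate in ranked:
        if 1.0 - miss >= target:
            break
        chosen.append(name)
        miss *= 1.0 - contact_prob(rate, deadline)
    return chosen, 1.0 - miss


# Hypothetical contact rates (contacts per hour) between each relay and the destination.
rates = {"r1": 0.5, "r2": 0.2, "r3": 0.1}
chosen, achieved = select_relays(rates, deadline=2.0, target=0.75)
print(chosen, round(achieved, 3))  # chosen == ['r1', 'r2'], achieved is about 0.753
```

  With independent exponential contacts, the miss probabilities multiply, so the achieved probability equals 1 - exp(-deadline * sum of chosen rates). That is why adding a relay with even a low contact rate always helps, at the cost of one more replica.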