COMPUTING ON JETSTREAM: STREAMING ANALYTICS IN THE WIDE-AREA
Matvey Arye
Joint work with: Ari Rabkin, Sid Sen, Mike Freedman and Vivek Pai

THE RISE OF GLOBAL DISTRIBUTED SYSTEMS
[figure: a globally distributed CDN]

TRADITIONAL ANALYTICS
Centralized database
[figure: CDN nodes backhauling data to a centralized database]

BANDWIDTH IS EXPENSIVE
Price trends 2005-2008:
• CPU: 16x
• Storage: 10x
• Bandwidth: 2.7x
[Above the Clouds, Armbrust et al.]

BANDWIDTH TRENDS
[figure: bandwidth trends, ~20% annual change — TeleGeography's Global Bandwidth Research Service]

BANDWIDTH COSTS
• Amazon EC2 bandwidth: $0.05 per GB
• Wireless broadband: $2 per GB
• Cell phone broadband (AT&T/Verizon): $6 per GB
  – (Other providers are similar)
• Satellite bandwidth: $200-$460 per GB
  – May drop to ~$20

THIS APPROACH IS NOT SCALABLE
Centralized database
[figure: a growing CDN overwhelming the centralized database]

THE COMING FUTURE: DISPERSED DATA
[figure: many dispersed databases in place of one central database]

WIDE-AREA COMPUTER SYSTEMS
• Web services
  – CDNs
  – Ad services
  – IaaS
  – Social media
• Infrastructure
  – Energy grid
• Military
  – Global network
  – Drones
  – UAVs
  – Surveillance

NEED QUERIES ON A GLOBAL VIEW
• CDNs:
  – Popularity of websites globally
  – Tracking security threats
• Military:
  – Threat "chatter" correlation
  – Big-picture view of the battlefield
• Energy grid:
  – Wide-area view of energy production and expenditure

STANDING COMPUTATION
[figure: at each site, a source feeds a cube and local processing; the processed data crosses the network bottleneck into a union cube that answers the user's query]

SOME QUERIES ARE EASY
"Server crashed" — alert me when servers crash.

OTHERS ARE HARD
[figure: requests arriving at CDN nodes across many sites]
How popular are all of my domains? URLs?

BEFORE JETSTREAM
[figure: bandwidth needed for backhaul over two days, against a 95% provisioning level; axes trade latency against bandwidth]
• Analyst's remorse: not enough data
• Buyer's remorse: wasted bandwidth, system overload or overprovisioning

WHAT HAPPENS DURING OVERLOAD?
[figure: over one day, the bandwidth needed for backhaul exceeds what is available]
Queue size grows without bound!
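The claim above — that backlog grows without bound whenever the data rate exceeds the available bandwidth — can be seen with a minimal simulation. This is an illustrative sketch, not JetStream code; the rates are made up.

```python
def queue_sizes(arrival_rate, service_rate, seconds):
    """Track the backlog when `arrival_rate` items/s must cross a link
    that can only drain `service_rate` items/s."""
    backlog = 0
    sizes = []
    for _ in range(seconds):
        backlog += arrival_rate                # data produced this second
        backlog -= min(backlog, service_rate)  # data the link carries away
        sizes.append(backlog)
    return sizes

# With a 10 items/s source and a 6 items/s link, the queue grows by 4 each
# second and never drains:
print(queue_sizes(10, 6, 5))  # [4, 8, 12, 16, 20]
```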
THE JETSTREAM VISION
[figure: over two days, JetStream's bandwidth usage tracks what is available even when backhaul demand exceeds it]
JetStream lets programs adapt to shortages and backfill later.
Need new abstractions for programmers.

SYSTEM ARCHITECTURE
[figure: a coordinator (planner + library) turns a query graph submitted through the JetStream API into an optimized query running on worker-node daemons; the control plane spans sites while the data plane carries streams from sources across compute resources at several sites]

AN EXAMPLE QUERY
[figure: at sites A and C, a file-read operator parses a log file into local storage, which is queried every 10 s; results flow to central storage at site B]

ADAPTIVE DEGRADATION
[figure: dataflow operators turn local data into summarized or approximated data before it crosses the network]
• Feedback control to decide when to degrade
• User-defined policies for how to degrade data

MONITORING AVAILABLE BANDWIDTH
Sources insert time markers into the data stream every k seconds.
The network monitor records the time t it took to process each interval; k/t estimates the available capacity.

WAYS TO DEGRADE DATA
• Can coarsen a dimension
• Can drop low-rank values

AN INTERFACE FOR DEGRADATION (I)
[figure: a coarsening operator samples incoming data before the network, which reports "sending 4x too much"]
• First attempt: policy specified by choosing an operator.
• Operators read the congestion sensor and respond.
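The marker-based congestion sensor described above can be sketched as follows. This is an illustrative reading of the mechanism, not JetStream's actual API; the class and method names are invented, and the clock is passed in explicitly to keep the sketch deterministic.

```python
class CongestionSensor:
    """Sources emit a marker every k seconds of stream time; the monitor
    measures the wall-clock time t between markers. k/t estimates what
    fraction of the offered load the link can absorb."""

    def __init__(self, k):
        self.k = k
        self.last_marker = None
        self.ratio = 1.0  # < 1 means congested, > 1 means headroom

    def on_marker(self, now):
        # `now` is the wall-clock time this marker reached the monitor.
        if self.last_marker is not None:
            t = now - self.last_marker  # time to drain one k-second interval
            self.ratio = self.k / t
        self.last_marker = now
        return self.ratio

# A 10-second interval that takes 40 s to drain means we are sending
# 4x too much, exactly the situation in the figure above:
sensor = CongestionSensor(k=10)
sensor.on_marker(0.0)
print(sensor.on_marker(40.0))  # 0.25
```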
COARSENING REDUCES DATA VOLUMES

01:01:01 foo.com/a 1
01:01:02 foo.com/a 2
01:01:01 foo.com/b 10
01:01:02 foo.com/b 15
01:01:01 foo.com/c 5
01:01:02 foo.com/c 20

coarsens to

01:01:* foo.com/a 3
01:01:* foo.com/b 25
01:01:* foo.com/c 25

BUT NOT ALWAYS

01:01:01 foo.com/a 1
01:01:02 bar.com/a 2
01:01:01 foo.com/b 10
01:01:02 bar.com/b 15
01:01:01 foo.com/c 5
01:01:02 bar.com/c 20

coarsens to

01:01:* foo.com/a 1
01:01:* foo.com/b 10
01:01:* foo.com/c 5
01:01:* bar.com/a 2
01:01:* bar.com/b 15
01:01:* bar.com/c 20

DEPENDS ON LEVEL OF COARSENING
[figure: compression factor (1x-256x, log scale) vs. aggregation time period (5 s, minute, 5 m, hour, day), for domains and for URLs; data from CoralCDN logs]

GETTING THE MOST DATA QUALITY FOR THE LEAST BW
Issue: some degradation techniques give good quality but have unpredictable savings.
Solution: use multiple techniques.
• Start with the technique that gives the best quality
• Supplement with other techniques when bandwidth is scarce
=> Keeps latency bounded; minimizes analyst's remorse.

ALLOWING COMPOSITE POLICIES
[figure: a coarsening operator and a sampling operator both sit before the network, which reports "sending 4x too much"]
• Chaos if two operators are simultaneously responding to the same sensor.
• Operator placement is constrained in ways that don't match the degradation policy.

INTRODUCING A CONTROLLER
[figure: a controller watches the network ("sending 4x too much") and tells the operators "drop 75% of data!"]
• Introduce a controller for each network connection that determines which degradations to apply.
• Degradation policies are defined per controller.
• Policy is no longer constrained by operator topology.

DEGRADATION

Type                  | Mergeability | Errors     | Predictable size savings
Dimension coarsening  | Yes*         | Resolution | No
Consistent sampling   | Yes          | Sampling   | Yes
Local filtering       | No           | Sampling   | Yes
Multi-round filtering | No           | None       | Depends
Approx. aggregate     | Depends      | Approx.    | Depends

MERGEABILITY IS NONTRIVIAL
Windows every 5: 01-05, 06-10, 11-15, 16-20, 21-25, 26-30. Can these be merged into windows every 6? Every 10? Every 30?
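The window-merging question above has a simple necessary condition, sketched below. This is an illustration, not JetStream's API: windows of size a can be unified exactly into windows of size b only if a divides b (and the windows align).

```python
def mergeable(a, b):
    """Can every-`a`-second windows be unified exactly into
    every-`b`-second windows? Checks the divisibility condition."""
    return b % a == 0

# Every-5 windows merge cleanly into every-10 and every-30 windows,
# but not into every-6 windows, matching the example above:
print([b for b in (6, 10, 30) if mergeable(5, b)])  # [10, 30]
```

This is why degradation operators need a fixed, compatible set of levels: an arbitrary pair of coarsening levels cannot generally be unified.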
Every 6: 01-06, 07-12, 13-18, 19-24, 25-30
Every 10: 01-10, 11-20, 21-30
Every 30: 01-30
• Can't cleanly unify data at arbitrary degradation levels.
• Degradation operators need to have fixed levels.

INTERFACING WITH THE CONTROLLER
[figure: sampling and coarsening operators feed the network, which reports "sending 4x too much"]
Operator: "Shrinking data by 50%. Possible levels: [0%, 50%, 75%, 95%, ...]"
Controller: "Go to level 75%."

A PLANNER FOR POLICY
Query planners: query + data distribution => execution plan.
Why not do this for degradation policy?
What is the query? For us, the policy affects data ingestion, so it affects all subsequent queries.
Planning: all potential queries + data distribution => policy.

EXPERIMENTAL SETUP
[figure: map of sites, including Princeton]
80 nodes on the VICCI testbed in the US and Germany.
Policy: drop data if there is insufficient bandwidth.

WITHOUT ADAPTATION
[figure: throughput under bandwidth shaping, without adaptation]

WITH ADAPTATION
[figure: throughput under bandwidth shaping, with adaptation]

COMPOSITE POLICIES
[figure: bandwidth (Mbits/sec, 0-8) vs. experiment time (0-350 sec) for five policies: no degradation, max window 5, max window 10, max window 5 + threshold, max window 5 + sampling]

OPERATING ON DISPERSED DATA
[figure: dispersed databases at many sites]

CUBE DIMENSIONS
[figure: a cube with a URL dimension (foo.com/r, foo.com/q, bar.com/n, bar.com/m) and a time dimension]

CUBE AGGREGATES
[figure: each cell, e.g. (bar.com/m, 01:01:01), holds an aggregate value]

CUBE ROLLUP
[figure: URLs rolled up to foo.com/* and bar.com/* at time 01:01:01]

FULL HIERARCHY
[figure: the full rollup lattice, up to URL: *, Time: 01:01:01; e.g. aggregates (29, 199) and (8, 90) combine into (37, 199)]

RICH STRUCTURE

Cell | URL       | Time
A    | bar.com/* | 01:01:01
B    | *         | 01:01:01
C    | foo.com/* | 01:01:01
D    | foo.com/r | 01:01:*
E    | foo.com/* | 01:01:*

TWO KINDS OF AGGREGATION
1. Rollups – across dimensions
2. Inserts – across sources
The data cube model constrains the system to use the same aggregate function for both.
Constraint: no queries on tuple arrival order.
Makes reasoning easier!
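The constraint above — one aggregate function serving both kinds of aggregation — can be illustrated with a toy cube. This is a sketch under assumed names (`Cube`, `insert`, `rollup_domains` are invented for illustration), not JetStream's cube implementation; the aggregate here is a simple sum of counts.

```python
from collections import defaultdict

class Cube:
    """Toy data cube keyed by (url, time), aggregating with sum."""

    def __init__(self):
        self.cells = defaultdict(int)

    def insert(self, url, time, count):
        # Inserts from different sources aggregate with the same
        # function (sum) that rollups use below.
        self.cells[(url, time)] += count

    def rollup_domains(self, time):
        # Rollup across the URL dimension: foo.com/r -> foo.com/*.
        out = defaultdict(int)
        for (url, t), count in self.cells.items():
            if t == time:
                out[url.split("/")[0] + "/*"] += count
        return dict(out)

cube = Cube()
cube.insert("foo.com/r", "01:01:01", 29)  # e.g. from site A
cube.insert("foo.com/q", "01:01:01", 8)   # e.g. from site C
print(cube.rollup_domains("01:01:01"))    # {'foo.com/*': 37}
```

Because no query depends on tuple arrival order, inserts from different sites can arrive in any order and the rollup result is the same.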
AN EXAMPLE QUERY
[figure, revisited: at sites A and C, a file-read operator parses a log file into local storage, queried every 10 s; results flow to central storage at site B]

SUBSCRIBERS
[figure: a subscriber queries site A's local storage every 10 s]
• Extract data from cubes to send downstream
• Control the latency vs. completeness trade-off

SUBSCRIBER API
A subscriber is an operator++:
• Notified of every tuple inserted into the cube
• Can slice and roll up the cube
Possible policies:
• Wait for all upstream nodes to contribute
• Wait for a timer to go off

FUTURE WORK
• Reliability
• Individual queries
  – Statistical methods
  – Multi-round protocols
• Currently working on improving top-k
• Fairness that gives the best data quality

Thanks for listening!
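The two subscriber policies listed under SUBSCRIBER API can be sketched together. This is an illustrative combination, not JetStream's actual subscriber interface; the class, method names, and string return values are invented, and a timestamp is passed in explicitly to keep the sketch deterministic.

```python
class Subscriber:
    """Emit a time window either once every upstream site has contributed
    (completeness) or once a timeout fires (bounded latency)."""

    def __init__(self, upstream_sites, timeout):
        self.upstream = set(upstream_sites)
        self.timeout = timeout
        self.seen = {}    # window -> set of sites heard from
        self.opened = {}  # window -> time the first tuple arrived

    def on_insert(self, window, site, now):
        # Called for every tuple inserted into the cube.
        self.seen.setdefault(window, set()).add(site)
        self.opened.setdefault(window, now)
        if self.seen[window] == self.upstream:
            return "emit-complete"   # every upstream node contributed
        if now - self.opened[window] >= self.timeout:
            return "emit-partial"    # bound latency, accept incompleteness
        return "wait"

sub = Subscriber(upstream_sites={"A", "C"}, timeout=10)
print(sub.on_insert("01:01", "A", now=0))  # wait
print(sub.on_insert("01:01", "C", now=3))  # emit-complete
```

The timeout is exactly the latency vs. completeness knob: a short timeout emits partial windows quickly, a long one waits for stragglers.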