SDIMS: A Scalable Distributed Information Management System

SDIMS: A Scalable Distributed Information Management System Praveen Yalagandula Mike Dahlin Laboratory for Advanced Systems Research (LASR) University of Texas at Austin July 12, 2016 Department of Computer Sciences, UT Austin 1 Goal  A Distributed Operating System Backbone  Information collection and management  Core functionality of large distributed systems  Monitor, query and react to changes in the system  Examples: System administration and management Service placement and location Sensor monitoring and control Distributed Denial-of-Service attack detection File location service Multicast tree construction Naming and request routing …………  Benefits     July 12, 2016 Ease development of new services Facilitate deployment Avoid repetition of same task by different services Optimize system performance Department of Computer Sciences, UT Austin 2 Contributions – SDIMS  Provides an important basic building block  Information collection and management  Satisfies key requirements  Scalability • With both nodes and attributes • Leverage Distributed Hash Tables (DHT)  Flexibility • Enable applications to control the aggregation • Provide flexible API: install, update and probe  Autonomy • Enable administrators to control flow of information • Build Autonomous DHTs  Robustness • Handle failures gracefully • Perform re-aggregation upon failures – Lazy (by default) and On-demand (optional) July 12, 2016 Department of Computer Sciences, UT Austin 3 Outline  SDIMS: a distributed operating system backbone  Aggregation abstraction  Our approach  Design  Prototype  Simulation and experimental results  Conclusions July 12, 2016 Department of Computer Sciences, UT Austin 4 Aggregation Abstraction Astrolabe [VanRenesse et al TOCS’03] f(f(a,b), 9 f(c,d)) f(a,b) 5  Attributes 4 f(c,d)  Information at machines  Aggregation tree  Physical nodes are leaves 2 d 1 2c 4a b  Each virtual node represents a logical group of nodes A2 A1 A0 • Administrative domains, groups within domains, etc.  Aggregation function, f, for attribute A  Computes the aggregated value Ai for level-i subtree • A0 = locally stored value at the physical node or NULL • Ai = f(Ai-10, Ai-11, …, Ai-1k) for virtual node with k children  Each virtual node is simulated by one or more machines  Example: Total users logged in the system  Attribute: numUsers  Aggregation Function: Summation July 12, 2016 Department of Computer Sciences, UT Austin 5 Outline  SDIMS: a distributed operating system backbone  Aggregation abstraction  Our approach  Design  Prototype  Simulation and experimental results  Conclusions July 12, 2016 Department of Computer Sciences, UT Austin 6 Scalability with nodes and attributes  To be a basic building block, SDIMS should support  Large number of machines • Enterprise and global-scale services • Trend: Large number of small devices  Applications with a large number of attributes • Example: File location system – Each file is an attribute – Large number of attributes  Challenges: build aggregation trees in a scalable way  Build multiple trees • Single tree for all attributes  load imbalance  Ensure small number of children per node in the tree • Reduces maximum node stress July 12, 2016 Department of Computer Sciences, UT Austin 7 Building aggregation trees: Exploit DHTs  A DHT can be viewed as multiple aggregation trees  Distributed Hash Tables (DHTs)  Each node assigned an ID from large space (160-bit)  For each prefix (k), each node has a link to another node • Matching k bits • Not matching on the k+1 bit  Each key from ID space mapped to a node with closest ID  Routing for a key: Bit correction • Example: from 0101 for key 1001 • 0101  1110  1000  1000  1001  Lots of different algorithms with different properties • Pastry, Tapestry, CAN, CHORD, SkipNet, etc. • Load-balancing, robustness, etc.  A DHT can be viewed as a mesh of trees  Routes from all nodes to a particular key form a tree  Different trees for different keys July 12, 2016 Department of Computer Sciences, UT Austin 8 DHT trees as aggregation trees  Key 111 111 11x 1xx 000 July 12, 2016 100 010 110 001 101 Department of Computer Sciences, UT Austin 011 111 9 DHT trees as aggregation trees  Keys 111 and 000 000 111 00x 11x 0xx 1xx 000 July 12, 2016 100 010 110 001 101 Department of Computer Sciences, UT Austin 011 111 10 API – Design Goals  Expose scalable aggregation trees from DHT  Flexibility: Expose several aggregation mechanisms  Attributes with different read-to-write ratios • “CPU load” changes often: a write-dominated attribute – Aggregate on every write  too much communication cost • “NumCPUs” changes rarely: a read-dominated attribute – Aggregate on reads  unnecessary latency  Spatial and temporal heterogeneity • Non-uniform and changing read-to-write rates across tree • Example: a multicast session with changing membership  Support sparse attributes of same functionality efficiently  Examples: file location, multicast, etc.  Not all nodes are interested in all attributes July 12, 2016 Department of Computer Sciences, UT Austin 11 Design of Flexible API  New abstraction: separate attribute type from attribute name  Attribute = (attribute type, attribute name)  Example: type=“fileLocation”, name=“fileFoo”  Install: an aggregation function for a type  Amortize installation cost across attributes of same type  Arguments up and down control aggregation on update  Update: the value of a particular attribute  Aggregation performed according to up and down  Aggregation along tree with key=hash(Attribute)  Probe: for an aggregated value at some level  If required, aggregation done to produce this result  Two modes: one-shot and continuous July 12, 2016 Department of Computer Sciences, UT Austin 12 Scalability and Flexibility  Install time aggregation and propagation strategy  Applications can specify up and down  Examples • Update-Local up=0, down=0 • Update-All up=ALL, down=ALL • Update-Up up=ALL, down=0  Spatial and temporal heterogeneity  Applications can exploit continuous mode in probe API • To propagate aggregate values to only interested nodes • With expiry time, to propagate for a fixed time  Also implies scalability with sparse attributes July 12, 2016 Department of Computer Sciences, UT Austin 13 Scalability and Flexibility  Install time aggregation and propagation strategy  Applications can specify up and down  Examples • Update-Local up=0, down=0 • Update-All up=ALL, down=ALL • Update-Up up=ALL, down=0  Spatial and temporal heterogeneity  Applications can exploit continuous mode in probe API • To propagate aggregate values to only interested nodes • With expiry time, to propagate for a fixed time  Also implies scalability with sparse attributes July 12, 2016 Department of Computer Sciences, UT Austin 14 Scalability and Flexibility  Install time aggregation and propagation strategy  Applications can specify up and down  Examples • Update-Local up=0, down=0 • Update-All up=ALL, down=ALL • Update-Up up=ALL, down=0  Spatial and temporal heterogeneity  Applications can exploit continuous mode in probe API • To propagate aggregate values to only interested nodes • With expiry time, to propagate for a fixed time  Also implies scalability with sparse attributes July 12, 2016 Department of Computer Sciences, UT Austin 15 Scalability and Flexibility  Install time aggregation and propagation strategy  Applications can specify up and down  Examples • Update-Local up=0, down=0 • Update-All up=ALL, down=ALL • Update-Up up=ALL, down=0  Spatial and temporal heterogeneity  Applications can exploit continuous mode in probe API • To propagate aggregate values to only interested nodes • With expiry time, to propagate for a fixed time  Also implies scalability with sparse attributes July 12, 2016 Department of Computer Sciences, UT Austin 16 Scalability and Flexibility  Install time aggregation and propagation strategy  Applications can specify up and down  Examples • Update-Local up=0, down=0 • Update-All up=ALL, down=ALL • Update-Up up=ALL, down=0  Spatial and temporal heterogeneity  Applications can exploit continuous mode in probe API • To propagate aggregate values to only interested nodes • With expiry time, to propagate for a finite time  Also implies scalability with sparse attributes July 12, 2016 Department of Computer Sciences, UT Austin 17 Autonomy  Systems spanning multiple administrative domains  Allow a domain administrator control information flow • Prevent external observer from observing the information in the domain • Prevent external failures from affecting the operations in the domain  Support for efficient domain wise aggregation  DHT trees might not conform  Autonomous DHT  Two properties 110 101 • Path Locality • Path Convergence 100  Reach domain root first July 12, 2016 111 010 000 001 Domain 1 Domain 2 Department of Computer Sciences, UT Austin 011 Domain 3 18 Outline  SDIMS: a distributed operating system backbone  Aggregation abstraction  Our approach     Leverage Distributed Hash Tables Separate attribute type from name Flexible API Prototype and evaluation  Conclusions July 12, 2016 Department of Computer Sciences, UT Austin 19 Prototype and Evaluation  SDIMS prototype  Built on top of FreePastry [Druschel et al, Rice U.]  Two layers • Bottom: Autonomous DHT • Top: Aggregation Management Layer  Methodology  Simulation • Scalability • Flexibility  Prototype • Micro-benchmarks on real networks – PlanetLab – CS Department July 12, 2016 Department of Computer Sciences, UT Austin 20 Simulation Results - Scalability  Methodology  Two points  Sparse attributes: multicast sessions with small size membership  Node Stress = Amt. of incoming and outgoing info  Max Node Stress an order of magnitude less than Astrolabe MAXdecreases node stressaswith 8 increases  Max Node Stress theparticipant number ofsize nodes 1e+07 Astrolabe 65536 Node Stress 1e+06 Astrolabe 4096 Astrolabe 256 100000 SDIMS 256 10000 SDIMS 4096 1000 SDIMS 65536 100 10 1 1 July 12, 2016 10 100 1000 Number of sessions Department of Computer Sciences, UT Austin 10000 100000 21 Simulation Results - Flexibility Number of messages per operation  Simulation with 4096 nodes  Attributes with different up and down strategies 10000 1000 Update-All Update-Local Up=5, down=0 100 10 1 Up=all, down=5 Update-Up 0.1 0.0001 July 12, 2016 0.01 1 100 Read to Write ratio Department of Computer Sciences, UT Austin 10000 22 Prototype Results  CS department: 180 machines (283 SDIMS nodes)  PlanetLab: 70 machines Planet Lab Latency (ms) Department Network 800 3500 700 3000 600 2500 500 2000 400 1500 300 1000 200 500 100 0 0 Update - All July 12, 2016 Update - Up Update - Local Update - All Update - Up Update - Local Department of Computer Sciences, UT Austin 23 Conclusions  SDIMS – basic building block for large-scale distributed services  Provides information collection and management  Our Approach  Scalability and Flexibility through • Leveraging DHTs • Separating attribute type from attribute name • Providing flexible API: install, update and probe  Autonomy through • Building Autonomous DHTs  Robustness through • Default lazy reaggregation • Optional on-demand reaggregation July 12, 2016 Department of Computer Sciences, UT Austin 24 For more information: http://www.cs.utexas.edu/users/ypraveen/sdims July 12, 2016 Department of Computer Sciences, UT Austin 25

SDIMS: A Scalable Distributed Information Management System

Related documents

Products

Support

SDIMS: A Scalable Distributed Information Management System

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib