Skoll: Distributed Continuous Quality Assurance

Atif Memon, Adam Porter, Cemal Yilmaz, Adithya Nagarajan (University of Maryland, College Park)
Douglas Schmidt, Balachandran Natarajan (Vanderbilt University)

Presenter: Cemal Yilmaz
http://www.cs.umd.edu/projects/skoll

Quality Assurance Never Sleeps: The ACE+TAO Example

ACE+TAO characteristics:
- Large user community: 20,000+ users worldwide
- Large code base: 2M+ lines of C++ code
- Geographically distributed developers
- Continuous evolution: 200+ CVS commits per week
- Highly configurable program family: over 500 configuration options; dozens of OS, compiler, and platform combinations

Continuous QA:
- 40+ workstations and servers continuously perform functional testing and performance evaluation
- Still doesn't cover all configurations
- Developers are forced to release untested code

Persistent Challenges and Emerging Opportunities

Persistent challenges:
- Scale
- Time-to-market pressure
- Scattered resources and distributed developers
- Incomplete information
- Rapid updates, frequent releases
- Heterogeneity
- Massive configuration and optimization space
- Unobservable usage context

Emerging opportunity: leverage remote computing resources and network ubiquity for distributed, continuous QA (DCQA).

Existing DCQA Approaches

- Auto-build scoreboard systems, e.g., GNU GCC, ACE+TAO, and Mozilla's Tinderbox
- Error reporting based on prepackaged installation tests
- Online crash reporting, e.g., Netscape's QFA and Microsoft's Watson

Shortcomings: these QA processes are inadequate, opaque, inefficient, and inflexible:
- Scope: generally restricted to functional testing, and often incomplete
- Documentation: no record of what has or hasn't undergone QA
- Control: developers have no control over the QA processes
- Adaptation: cannot learn from earlier test results

The Skoll Project

Vision: QA processes conducted around the world, around the clock, on a powerful virtual computing grid formed by thousands of user machines during their off-peak hours.

Generic Skoll DCQA process:
- Identify a QA task (e.g., testing, profiling, or anomaly detection for a program family)
- Divide the task into subtasks, each of which can be performed on a single processing node (i.e., a user machine)
- Distributed and opportunistic: when a node becomes available, allocate one or more subtasks to it, distribute the subtask code, and collect the results when available
- Adaptive: use data-driven feedback to schedule and coordinate subtask allocation

We are currently building the infrastructure, tools, and algorithms for developing and executing thorough, transparent, managed, adaptive DCQA processes.

The Skoll System

[Architecture diagram: Skoll servers host the configuration model, the Intelligent Steering Agent (ISA), adaptation strategies (e.g., nearest neighbor), automatic characterization (e.g., classification trees), and visualization (e.g., a scoreboard). Skoll clients register, send subtask requests along with their client characteristics (e.g., compiler), receive a client kit plus subtask code such as the XML example shown later, and upload their subtask results to the server.]

Configuration Model

- Options, each with a discrete set of settings
- A configuration is a mapping of each option to one of its settings
- Inter-option constraints: not all possible configurations are valid; constraints are represented as Pi → Pj, meaning that if predicate Pi evaluates to TRUE, then predicate Pj must also evaluate to TRUE

Example option table:

Option    | Settings            | Interpretation
Compiler  | {gcc2.96, SUNCC5_1} | Compiler
AMI       | {1 = yes, 0 = no}   | Enable feature
CORBA_MSG | {1 = yes, 0 = no}   | Enable feature
CALLBACK  | {1 = yes, 0 = no}   | Enable feature
POLLER    | {1 = yes, 0 = no}   | Enable feature
run(T)    | {1 = yes, 0 = no}   | Test T runnable

Example constraints (a sketch follows below):
- AMI = 1 → CORBA_MSG = 1
- run(Multiple/run_test) = 1 → Compiler = SUNCC5_1
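To make the model concrete, here is a minimal Python sketch of options and implication constraints, using the values from the table above (the run(T) test predicate and its compiler constraint are omitted for brevity). The names (OPTIONS, is_valid, valid_configs) are illustrative, not Skoll's actual API.

    from itertools import product

    # Options and their discrete settings, from the example table.
    OPTIONS = {
        "Compiler":  ["gcc2.96", "SUNCC5_1"],
        "AMI":       [0, 1],
        "CORBA_MSG": [0, 1],
        "CALLBACK":  [0, 1],
        "POLLER":    [0, 1],
    }

    # Constraints of the form Pi -> Pj: if the antecedent predicate holds,
    # the consequent must also hold. Each is a pair of functions over a
    # configuration (a dict mapping option name -> chosen setting).
    CONSTRAINTS = [
        (lambda c: c["AMI"] == 1, lambda c: c["CORBA_MSG"] == 1),
    ]

    def is_valid(config):
        """A configuration is valid iff every implication holds."""
        return all((not pi(config)) or pj(config) for pi, pj in CONSTRAINTS)

    def valid_configs():
        """Enumerate the valid part of the full cross-product space."""
        names = list(OPTIONS)
        for settings in product(*(OPTIONS[n] for n in names)):
            config = dict(zip(names, settings))
            if is_valid(config):
                yield config

    # 24 of the 32 possible settings survive the AMI -> CORBA_MSG constraint.
    print(sum(1 for _ in valid_configs()))

Brute-force enumeration works at this toy scale; Skoll's ISA instead uses an AI planner to navigate spaces with tens of thousands of valid configurations, as described below.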
Example Configuration Model for ACE+TAO

Option categories:
- Compilation options (features, libraries, physical properties)
- Test case options (properties required by test cases)
- Runtime options (policies, runtime optimizations)

Sample configuration options:

Option                       | Type         | Value space
TAO_HAS_AMI                  | compile-time | {0, 1}
TAO_HAS_MINIMUM_CORBA        | compile-time | {0, 1}
ORBCollocation               | runtime      | {global, per-orb, no}
ORBConnectionPurgingStrategy | runtime      | {lru, lfu, fifo, null}

- A sample constraint on compile-time options: (TAO_HAS_AMI = 1) → (TAO_HAS_MINIMUM_CORBA = 0)
- A sample constraint on tests: run(ORT/run_test.pl) → (TAO_HAS_MINIMUM_CORBA = 0)
- Constraints on runtime options: none (N/A)

Subset of ACE+TAO Configuration Space

[Figure: a subset of the ACE+TAO configuration space.]

Intelligent Steering Agent (ISA)

Decides which QA subtasks to allocate to clients, and in which order, based on:
- Client characteristics
- The configuration model
- Previous subtask results (via the adaptation strategies discussed later)

Internals: uses an AI planner to do constraint solving, scheduling, and learning, and iteratively generates acceptable plans:
- Initial state: the default software configuration
- Goal state: a description of the desired configuration, partly specified by users
- Operators: inter-option constraints and knowledge of past executions
- Generated plan: the QA task to be run at the client
- Navigation strategies: with/without replacement, and pre-computed

Subtask Code Example

The ISA generates subtask code for each incoming request:

    <subtask>
      <download>
        <cvs>cvs.doc.wustl.edu</cvs>
        <module>ACE+TAO</module>
        <version>v5.2.3</version>
      </download>
      <configure>
        <option name="AMI" val="0" />
        <option name="MSG" val="1" />
      </configure>
      <build target="ACE" />
      <build target="TAO" />
      <build target="tests/HelloWorld" />
      ...
      <run-test target="tests/HelloWorld" />
      <upload target="all-activity" />
    </subtask>

Default Behavior

[Figure illustrating the default behavior.]

Adaptation Strategies

- Allow the global QA process to adapt as knowledge changes
- Implemented as programs that process subtask results and modify ISA behavior (a sketch follows below), for example by:
  - Changing subtask priorities
  - Introducing new planning goals
  - Changing the configuration model
  - Terminating the process
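The slides describe adaptation strategies abstractly; as a rough sketch of how one might plug into the ISA, consider the Python class below. The ISA interface (add_constraint) and the callback name (on_result) are hypothetical, invented for illustration; the behavior, installing a temporary constraint once a failing option pattern is confirmed, mirrors the "temporary constraints" strategy used in the studies that follow.

    class TemporaryConstraintStrategy:
        """Illustrative adaptation strategy: after enough failures match a
        suspected option pattern, install a temporary constraint so the ISA
        stops scheduling configurations in the known-failing subspace."""

        def __init__(self, isa, pattern, threshold=5):
            self.isa = isa              # hypothetical handle to the ISA
            self.pattern = pattern      # e.g. {"POLLER": 1, "CALLBACK": 0}
            self.threshold = threshold
            self.failures = 0
            self.installed = False

        def on_result(self, config, outcome):
            """Hypothetical callback the ISA invokes per finished subtask."""
            if self.installed or outcome != "FAIL":
                return
            if all(config.get(k) == v for k, v in self.pattern.items()):
                self.failures += 1
            if self.failures >= self.threshold:
                # Exclude the failing pattern, e.g. (POLLER = 1) -> (CALLBACK = 1).
                self.isa.add_constraint(
                    lambda c: not all(c.get(k) == v
                                      for k, v in self.pattern.items()))
                self.installed = True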
Adaptation Strategies for ACE+TAO

- Terminate the QA process
- Nearest-neighbor search strategy
- Temporary constraints

Nearest Neighbor Execution

[Figures, three slides: step-by-step execution of the nearest-neighbor strategy. When a configuration fails, the configurations differing from it in exactly one option setting are scheduled next, so the search expands outward from each failure until the failing subspace is mapped.]

Automatic Fault Characterization

- Build classification-tree models of QA results
- Analyze the resulting trees for statistically interesting patterns
- Use the patterns to inform visualization tools (a sketch follows below)

[Figure: an example classification tree that splits on CORBA_MESSAGING, AMI, AMI_POLLER, and AMI_CALLBACK; its leaves label configurations as OK, ERR-1, ERR-2, or ERR-3.]
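The slides name classification trees but show no code. Below is a minimal sketch using scikit-learn's DecisionTreeClassifier as a stand-in learner (an assumption; the original work predates this library) on a small synthetic results table. It recovers a failing pattern of the kind reported in the studies below.

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Columns: AMI, CORBA_MSG, CALLBACK, POLLER (synthetic results table).
    X = [
        [1, 1, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1], [0, 0, 0, 0],
        [1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1],
    ]
    # Outcome per configuration: rows with POLLER=1 and CALLBACK=0 fail.
    y = ["OK", "OK", "ERR-1", "OK", "ERR-1", "OK", "OK", "ERR-1"]

    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(export_text(tree,
                      feature_names=["AMI", "CORBA_MSG", "CALLBACK", "POLLER"]))
    # The printed tree splits on POLLER and then CALLBACK, i.e. it recovers
    # the failing subspace "POLLER = 1 and CALLBACK = 0" as a single branch.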
Feasibility Study

- Initial study focused on several basic testing scenarios:
  - Checking for a clean compile
  - Regression testing with numerous compile-time options
  - Regression testing with numerous compile-time and run-time options
- The Skoll process was performed on one stable release of ACE+TAO
- The Skoll process ran on 10 client machines located at UMD
- Conjecture: a Skoll-supported process will be superior to ACE+TAO's ad hoc QA processes; it will:
  - Automatically manage and coordinate the QA process
  - Detect problems more quickly on average
  - Automatically characterize test results to give insight into the problems

Study #1

Task: does ACE+TAO build with all compile-time options?

Configuration model:
- 17 binary-valued compile-time options with 35 inter-option constraints
- 82,000+ valid configurations

Results and observations:
- The process was terminated after the first 500+ configurations all failed to compile
- Cause: 7 options (controlling CORBA messaging) weren't functioning; the code had been modified but not properly tested
- Action: developers removed the 7 options and the corresponding constraints, and plan to make them runtime options

Study #1 (Cont.)

Configuration model:
- 10 binary compile-time options, 7 inter-option constraints
- 89 valid configurations

Results and observations:
- 29 configurations compiled cleanly
- 32 failed (orbconf.h, line 630)
  - Characterization: AMI = 1 and CORBA_MSG = 0
  - Root cause: missing constraint
  - Action: refined the model by temporarily adding (AMI = 1) → (CORBA_MSG = 1)
- 20 failed (asynch_dispatcher.h, line 38)
  - Characterization: CORBA_MSG = 0
  - Root cause: missing #include, conditionally included when CORBA_MSG = 1
  - Action: refined the model by temporarily adding (CORBA_MSG = 0) → FALSE
- 8 failed (rt_init.cpp, line 137)
  - Characterization: POLLER = 1 and CALLBACK = 0
  - Root cause: previously undiscovered bug
  - Action: refined the model by temporarily adding (POLLER = 1) → (CALLBACK = 1)

Study #2

Task: do the installation tests fail with default run-time options? (This mirrors ACE+TAO's current practice.)

Configuration model and study execution:
- 10 binary-valued configuration options with 12 inter-option constraints
- 96 tests with 120 test constraints
- The 29 configurations that built cleanly in Study #1

Results and observations:
- 2,077 test compilations; 98 failed to compile
- The same 7 tests failed to compile in the same 14 configurations (7 × 14 = 98)
  - Characterization: CORBA_MSG = 1 and POLLER = 0 and CALLBACK = 0
  - Root cause: the CORBA messaging implementation mistakenly assumes that one of the POLLER and CALLBACK options is always set to 1
  - Previously undiscovered

Study #2 (Cont.)

Results and observations:
- 152 test runs failed
- The default configuration was already well-tested, and there was no evidence of option-related failures
- RTCORBA/Client_Protocol/run_test.pl failed 25 out of 29 times
  - No clear pattern in the passing configurations
  - Root cause: a race condition in the Shared Memory Inter-ORB Protocol (SHMIOP) implementation
- Persistent_IOR/run_test.pl failed in 1 configuration
  - Not seen before by developers and hard to recreate
  - Root cause: unknown
- MT_Timeout/run_test.pl failed in 14 configurations
  - No clear characterization
  - Root cause: responses to certain requests exceeded allowable limits
  - Multiple underlying failures? Or failures unrelated to configuration options?

Study #3

Task: do tests run error-free in different run-time configurations?

Configuration model:
- 10 binary-valued compile-time options with 12 inter-option constraints
- 96 tests with 120 test constraints
- 6 run-time configuration options with no constraints

Study execution:
- 18,792 valid configurations (648 run-time × 29 compile-time)
- Roughly 9,400 machine-hours (30 minutes per test suite)
- Nearest-neighbor adaptation strategy

Study #3 Results and Observations

- Several tests failed even though they had not failed in Study #2
- Some failed on every single configuration
  - Root cause: problems in feature-specific code
  - Root cause: problems in run-time option setting and processing
- 3 tests failed between 2,500 and 4,400 times:
  - 99.6% of Big_Twoways/run_test.pl failures
  - 99.5% of Param_Test/run_test.pl failures
  - 99.9% of MT_BiDir/run_test.pl failures
  - Characterization: ORBCollocation = NO
  - Background: when ORBCollocation = YES, local objects communicate directly; when NO, objects communicate over the network
  - Root cause: broken data marshalling/unmarshalling

Some Lessons Learned

- The Skoll approach covered the configuration space better than ACE+TAO's ad hoc approach
  - Quickly flagged real problems, some previously undiscovered
  - Led ACE+TAO developers to change the software in some cases
- The approach scaled well to the larger configuration space
- The configuration model was easy to extend
  - Added temporary constraints to work around some errors
- Automatic fault characterization helped find the root causes of failures quickly
- As the shape of a failing subspace becomes statistically apparent, we need a way to stop exploring it
  - Saves time
  - Frees resources to explore other untested configurations

Summary

- Basic infrastructure is in place
- Initial results are encouraging
  - Iteratively modeled complex configuration spaces to perform complex testing processes
  - Found a number of test failures corresponding to real bugs, some of which had not been found before
  - Developers benefited from our automatic fault characterization for localizing the root causes of certain failures
- Many extensions are currently in the works

Skoll Extensions

- Incorporate factor-covering designs (ISSTA '04); a toy sketch follows after this slide
  - Used to generate test data that cover all t-way interactions of the input space
  - Will be extended to allow detection of option settings affecting failures
- Add performance-related QA tasks
  - Refactoring ACE to shrink its memory footprint and enhance its run-time performance
- Enhance the ISA to include new (cost-based) scheduling models
- Enhance the Skoll infrastructure
  - Take over the ACE+TAO daily build process
  - Run a feasibility study using hundreds of machines around the world
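To make the factor-covering idea concrete, here is a toy greedy construction of a pairwise (t = 2) covering array in Python. This is a generic textbook-style greedy algorithm, not the ISSTA '04 procedure; the option names are reused from the earlier examples.

    from itertools import combinations, product

    OPTIONS = {"AMI": [0, 1], "CORBA_MSG": [0, 1],
               "CALLBACK": [0, 1], "POLLER": [0, 1]}

    def uncovered_pairs(rows, names):
        """All (option, value) pair combinations not yet covered by rows."""
        needed = {((a, va), (b, vb))
                  for a, b in combinations(names, 2)
                  for va in OPTIONS[a] for vb in OPTIONS[b]}
        for row in rows:
            for a, b in combinations(names, 2):
                needed.discard(((a, row[a]), (b, row[b])))
        return needed

    def greedy_pairwise():
        """Repeatedly pick the configuration covering the most missing pairs."""
        names = sorted(OPTIONS)
        candidates = [dict(zip(names, vals))
                      for vals in product(*(OPTIONS[n] for n in names))]
        rows = []
        while (missing := uncovered_pairs(rows, names)):
            def gain(c):
                return sum(((a, c[a]), (b, c[b])) in missing
                           for a, b in combinations(names, 2))
            rows.append(max(candidates, key=gain))
        return rows

    suite = greedy_pairwise()
    print(len(suite), "configurations cover all 2-way interactions, versus 16 exhaustive")

For 4 binary options the greedy suite needs only about 5 or 6 rows instead of 16, and the gap widens rapidly as options are added, which is what makes covering designs attractive for configuration spaces as large as ACE+TAO's.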