Dynamic Software Update Testing: Framework and Empirical Study

Christopher M. Hayden, Eric A. Hardisty, Michael Hicks, Jeffrey S. Foster
University of Maryland, College Park

Dynamic Software Updating (DSU)

- Performing updates to software at runtime has clear benefits:
  - Increased software availability
  - No need to terminate active connections or computations
- ... but can we trust updated software?
  - It is critical to ensure that updates are safe

Our Contributions

- Verification of DSU through testing:
  - Testing procedure
  - Test minimization algorithm
- Empirical study:
  - Effectiveness of minimization
  - Update safety and effectiveness of safety checks

DSU Safety

- DSU creates the opportunity for new sources of bugs:
  - Faulty state transformation
  - Unsafe update timing
- Safety checks restrict when updates may be applied:
  - Activeness Safety (AS)
  - Con-freeness Safety (CFS)

Activeness Safety (AS)

- AS prevents updates to active code
- In this example, no patch updating main or foo is allowed:

    main() {
      …
      foo();
      baz();
    }

    foo() {
      …
      bar();
    }

Con-freeness Safety (CFS)

- CFS (Stoyle et al. '05) allows updates to active code only when type safety can be ensured
- In this example (same code as above), no patch updating the signature of baz or bar is allowed

Unsafe Timing: Type Safety

Version 0:

    int foo(int x, int y) { return x + y; }
    void bar() { int z = 0; … z = foo(z, 5); }

Version 1 (patch):

    void foo(int *x, int y) { *x += y; }
    void bar() { int z = 0; … foo(&z, 5); }

- Crash: if the update is applied while the old bar() is running, the old call foo(z, 5) reaches the new foo, which dereferences an int as a pointer

DSU Testing

- Safety checks offer limited guarantees:
  - CFS and AS ensure type-safe execution
  - AS ensures that execution never returns to old code following an update
  - Neither property ensures safe update timing
- We propose testing to verify the correctness of allowed update points:
  - Use the existing suite of application system tests
  - Ensure that updating anywhere during the execution of those tests results in an execution that passes the test
Testing Procedure

- Approach:
  - Instrument the application to trace update points
  - Execute the system test and gather an initial trace
  - For each update point in the initial trace, perform an update test: force an update at that point while executing the system test

(Figure: the initial trace runs from trace start through the potential update points and passes; of the update tests, each forcing an update at a different point, some pass and one fails.)

Update Test Minimization

- Program traces may have thousands or millions of update points
- Many update tests have the same behavior for a given patch, so we can eliminate redundant tests

Version 0:

    void main() { foo(); bar(); baz(); }

Patch A:

    baz() {…}

- All update points yield the same behavior

Patch B:

    foo() {…}
    bar() {…}
    baz() {…}

- All update points yield distinct behavior

Minimization Algorithm

- Execution events are traced if they have the potential to conflict with a patch
  - An event conflicts with a patch p if applying p before the event might produce a different result than applying p after the event
  - Examples: function calls, global variable accesses
- Trace the execution of a test T on P0
- Iterate through the trace, noting the last update point each time we reach a conflicting trace element
- Run only the identified update tests T_p^n

Empirical Results

Experimental Setup

- Built the testing infrastructure on top of the Ginseng DSU system (Neamtiu et al.):
  - Modified to support tracing and updating at preselected update points
  - Inserted explicit update points before each function call to approximate more liberal systems
  - Disabled safety checking (CFS) for the experiments
- Tested 3 years of patches to OpenSSH and vsftpd (only OpenSSH results are reported in this talk)

Program Modifications

    foo() {
      while (1) {      // main loop
        update();
        extract {
          ...          // main loop body
        }
      }
      extract {
        ...            // after main loop
      }
    }

- Identify long-running loops
- Add a manually selected update point
- Perform loop body extraction
- Perform continuation extraction

Experiments: Update Test Suite

- How many update tests must be run to test real-world updates to real-world applications?
- How effective is minimization at eliminating redundant tests?

Update Test Suite Size: OpenSSH

                  Δ to next version    Reduction
    #      Tests  Sig   Fun   Type     All Points                      Activeness-Safe Points
    0      75     3     98    5        580,871 → 31,791 (95%)          35,314 → 3,027 (91%)
    1      75     0     6     0        705,322 → 1,795 (~100%)         587,578 → 1,717 (~100%)
    2      76     5     238   11       638,720 → 63,011 (90%)          20,902 → 2,353 (89%)
    3      91     0     18    0        772,198 → 4,324 (99%)           638,803 → 3,775 (99%)
    4      91     13    172   10       773,086 → 27,399 (96%)          21,343 → 1,564 (93%)
    5      104    0     24    1        878,235 → 17,398 (98%)          111,950 → 1,723 (98%)
    6      104    6     257   10       879,668 → 47,092 (95%)          44,278 → 2,139 (95%)
    7      104    4     179   12       918,717 → 89,601 (90%)          100,854 → 4,141 (96%)
    8      105    0     72    3        973,364 → 34,293 (96%)          61,724 → 2,070 (97%)
    9      104    10    157   7        933,514 → 52,356 (94%)          61,051 → 2,891 (95%)
    Total                              8,053,695 → 369,060 (95%)       1,683,797 → 25,400 (98%)

Empirical Study of Update Safety

- How many failures occur when applying updates arbitrarily?
- How many failures occur when applying updates subject only to the AS and CFS safety checks?
Safety: OpenSSH

                  Δ to next version    Update Tests
                                       All Points              CFS Points             AS Points
    #      Tests  Sig   Fun   Type     Failed     Total        Failed   Total         Failed  Total
    0      75     3     98    5        19,715     580,871      0        68,044        0       35,314
    1      75     0     6     0        0          705,322      0        705,322       0       587,578
    2*     76     5     238   11       306,965    638,720      1,688    75,307        4       20,902
    3      91     0     18    0        0          772,198      0        772,198       0       638,803
    4*     91     13    172   10       565,681    773,086      609      110,633       380     21,343
    5      104    0     24    1        10,703     878,235      0        130,000       0       111,950
    6      104    6     257   10       163,333    879,668      44,461   96,183        110     44,278
    7      104    4     179   12       11,380     918,717      1        80,070        1       100,854
    8      105    0     72    3        3          973,364      0        261,885       0       61,724
    9      104    10    157   7        357,919    933,514      24       121,337       0       61,051
    Total                              1,435,699  8,053,695    46,783   2,420,979     495     1,683,797

Unsafe Timing: Version Inconsistency

Version 0:

    void foo() { bar(); … baz(); }
    void bar() { … }
    void baz() { dig(); … }

Version 1 (patch):

    void foo() { bar(); … baz(); }
    void bar() { dig(); … }
    void baz() { … }

- An update applied between the calls to bar() and baz() runs the old bar() and the new baz(), so dig() is never called

Manually Selected Update Points

                  Δ to next version                            Safety
    #      Tests  Sig   Fun   Type     Reduction               Failed  Total
    0      75     3     98    5        566 → 566 (0%)          0       566
    1      75     0     6     0        630 → 592 (6%)          0       630
    2      76     5     238   11       568 → 568 (0%)          0       568
    3      91     0     18    0        783 → 770 (2%)          0       783
    4      91     13    172   10       782 → 782 (0%)          0       782
    5      104    0     24    1        860 → 841 (2%)          0       860
    6      104    6     257   10       859 → 859 (0%)          0       859
    7      104    4     179   12       850 → 850 (0%)          0       850
    8      105    0     72    3        868 → 823 (5%)          0       868
    9      104    10    157   7        833 → 833 (0%)          0       833
    Total                              7,599 → 7,484 (2%)      0       7,599

Summary

- We have argued that verification is necessary to prevent unsafe updates
- We have provided empirical evidence that AS and CFS cannot prevent all unsafe updates
- We have presented an approach for testing dynamic updates
- We have presented and evaluated a minimization strategy to make update testing more practical

Additional Slides

Reduction: vsftpd

           Δ to next version    Reduction
    #      Sig   Fun   Type     All Points                      Activeness-Safe Points
    0      0     6     0        210,142 → 26 (~100%)            102,307 → 26 (~100%)
    1      1     12    0        210,142 → 516 (~100%)           69,775 → 166 (~100%)
    2      0     21    0        215,223 → 1,122 (99%)           55,555 → 553 (99%)
    3      0     76    0        220,564 → 3,866 (98%)           37,265 → 1,912 (95%)
    4      0     10    1        218,586 → 19,893 (91%)          2,123 → 301 (86%)
    5      0     25    1        223,098 → 15,910 (93%)          67,330 → 3,567 (95%)
    6      0     100   2        233,199 → 200,653 (14%)         7,437 → 2,742 (63%)
    7      0     93    2        222,296 → 10,371 (95%)          3,098 → 275 (91%)
    Total                       1,753,250 → 252,357 (86%)       344,890 → 9,542 (97%)

Safety: vsftpd

           Δ to next version    All Points           CFS Points         AS Points          Manual Points
    #      Sig   Fun   Type     Failed  Total        Failed  Total      Failed  Total      Failed  Total
    0      0     6     0        0       210,142      0       210,142    0       102,307    0       80
    1      1     12    0        2,462   210,142      558     90,073     0       69,775     0       80
    2      0     21    0        0       215,223      0       215,223    0       55,555     0       80
    3      0     76    0        0       220,564      0       220,564    0       37,265     0       80
    4      0     10    1        43,233  218,586      546     4,478      0       2,123      0       80
    5      0     25    1        58      223,098      0       24,924     0       67,330     0       80
    6      0     100   2        2,115   233,199      0       3,737      0       7,437      0       82
    7      0     93    2        234     222,296      0       1,993      0       3,098      0       80
    Total                       48,102  1,753,250    1,104   771,134    0       344,890    0       642

Which Tests?

(Diagram: the behaviors of P0 and P1 overlap. Old behavior, found only in P0: bugs and deprecated features. Unchanged behavior: shared by P0 and P1. New behavior, found only in P1: bug fixes and new features.)

Nondeterminism

- Program traces may differ between runs:
  - Timing of signal handlers
  - Number of iterations of loops performing I/O
  - Dependence on random numbers, system time, memory addresses, etc.
- Handling nondeterminism:
  - Ensure that traces match up to the update point
  - Annotate regions of execution whose trace is ignored for matching purposes

Program Versions

OpenSSH:

                                       Δ to next version
    #      Version    LoC      Tests   Sig   Fun   Type
    0      3.5p1      46,735   75      3     98    5
    1      3.6.1p1    48,459   75      0     6     0
    2      3.6.1p2    48,473   76      5     238   11
    3      3.7.1p1    50,448   91      0     18    0
    4      3.7.1p2    50,460   91      13    172   10
    5      3.8p1      51,822   104     0     24    1
    6      3.8.1p1    51,838   104     6     257   10
    7      3.9p1      53,260   104     4     179   12

vsftpd:

                                       Δ to next version
    #      Version    LoC      Tests   Sig   Fun   Type
    0      2.0.0      13,048   13      0     6     0
    1      2.0.1      13,059   13      1     12    0
    2      2.0.2p2    13,114   13      0     21    0
    3      2.0.2p3    14,293   13      0     76    0
    4      2.0.2      16,870   13      0     10    1
    5      2.0.3      12,977   13      0     25    1
    6      2.0.4      14,427   14      0     100   2
    7      2.0.5      14,482   13      0     93    2

Unsafe Timing: Version Inconsistency (vsftpd)

Version 0:

    void handle_upload_common() {
      ret = do_file_recv();
      if (ret == SUCCESS) write(226, "OK.");
    }
    void do_file_recv() {
      … // receive file
      return ret;
    }

Version 1 (patch):

    void handle_upload_common() {
      ret = do_file_recv();
    }
    void do_file_recv() {
      … // receive file
      if (ret == SUCCESS) write(226, "OK.");
      return ret;
    }

Unsafe Timing: Version Inconsistency (OpenSSH)

Version 0:

    void maincont() { extracted(); … serverloop2(); }
    void extracted() { … }
    void serverloop2() { global_ptr = init; tmp = (*global_ptr).pw; }

Version 1 (patch):

    void maincont() { extracted(); … serverloop2(); }
    void extracted() { global_ptr = init; }
    void serverloop2() { tmp = (*global_ptr).pw; }

Activeness Safety (AS)

- AS prevents updates to active code
- In this example, no patch updating main or foo is allowed:

    main() {
      extracted();
      foo();
      …
      baz();
    }

    extracted() {
      // initialization code
      …
    }

    foo() {
      …
      bar();
    }

Minimization Algorithm

Initial trace:

    Update? (1) … Call(foo)   Update? (2) … Call(bar)   Update? (3) … Call(baz)

Patch A: baz() {…}

- The last update point advances from 1 to 2 to 3; Call(foo) and Call(bar) do not conflict with the patch, so the set of points to test stays {} until Call(baz) adds point 3
- Points to test: { 3 }

Patch B: foo() {…} bar() {…} baz() {…}

- Every call conflicts with the patch, so each one adds the last update point seen: {1}, then {1, 2}, then {1, 2, 3}
- Points to test: { 1, 2, 3 }
