Automating Bypass Testing for Web Applications
Vasileios Papadimitriou
vpapadim@gmu.edu
The Volgenau School of Information Technology & Engineering
Dept. of Information & Software Engineering
George Mason University
Fairfax, VA USA
Aug. 2, 2005

Introduction
• The World Wide Web changed the methods of software development and deployment
  – We value reliability, usability, and security more than “time to market”
  – “Extremely loosely coupled” systems
  – Browser-based clients
  – HTTP
• Web applications become vulnerable to input manipulation that may:
  – Reduce reliability
  – Compromise security

Introduction (cont.)
• Offutt and Wu's work on bypass testing of web applications is extended
  – Theoretical background is revised to support an automated approach
• HttpUnit is used to build a prototype software application that automatically:
  – Parses HTML pages
  – Identifies forms and their fields
  – Creates bypass test cases
  – Submits test cases to the application’s server

Presentation Outline
• Client-side validation types & rules to automatically generate test cases
• AutoBypass testing tool and demo
• Experiment design
• Results
• Conclusions

Types of Client Input Validation
• Client-side input validation is performed by HTML form controls, their attributes, and client-side scripts that access the DOM
• Validation types fall into two categories: HTML and scripting
  – HTML supports syntactic validation
  – Client scripting can perform both syntactic and semantic validation
HTML Constraints
• Length (max input characters)
• Value (preset values)
• Transfer Mode (GET or POST)
• Field Element (preset fields)
• Target URL (links with values)
Scripting Constraints
• Data Type (e.g. integer check)
• Data Format (e.g. ZIP code format)
• Data Value (e.g. age value range)
• Inter-Value (e.g. credit # + exp. date)
• Invalid Characters (e.g. <, ../, &)
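The HTML constraints above can be turned into test cases mechanically. The following is a minimal sketch in Python using only the standard library; it is not the AutoBypass implementation (which is built on HttpUnit), and the names `FormParser`, `bypass_cases`, the sample form, and the default values are all hypothetical. It covers the Length, Value, Transfer Mode, and Field Element rules only, and it builds requests without submitting them.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class Field:
    name: str
    kind: str                          # "text", "hidden", "select", ...
    maxlength: Optional[int] = None    # Length constraint, if any
    preset: List[str] = field(default_factory=list)  # Value constraint


class FormParser(HTMLParser):
    """Collect a form's method, action, and named fields (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.method, self.action = "get", ""
        self.fields: List[Field] = []
        self._select: Optional[Field] = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.method = (a.get("method") or "get").lower()
            self.action = a.get("action") or ""
        elif tag == "input" and a.get("name"):
            kind = (a.get("type") or "text").lower()
            ml = a.get("maxlength")
            f = Field(a["name"], kind, int(ml) if ml and ml.isdigit() else None)
            if kind in ("hidden", "radio", "checkbox") and a.get("value") is not None:
                f.preset.append(a["value"])
            self.fields.append(f)
        elif tag == "select" and a.get("name"):
            self._select = Field(a["name"], "select")
            self.fields.append(self._select)
        elif tag == "option" and self._select is not None:
            self._select.preset.append(a.get("value") or "")

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def bypass_cases(form, defaults):
    """Return (rule, method, params) tuples; each violates one HTML constraint."""
    cases = []
    for f in form.fields:
        if f.maxlength is not None:            # Length violation
            params = dict(defaults)
            params[f.name] = "x" * (f.maxlength + 1)
            cases.append(("length", form.method, params))
        if f.preset:                           # Value violation
            bad = "not-a-preset-value"
            while bad in f.preset:
                bad += "x"
            params = dict(defaults)
            params[f.name] = bad
            cases.append(("value", form.method, params))
    # Transfer Mode violation: use the method the form did not declare.
    other = "post" if form.method == "get" else "get"
    cases.append(("transfer_mode", other, dict(defaults)))
    # Field Element violations: drop each field in turn, then add an extra one.
    for f in form.fields:
        params = {k: v for k, v in defaults.items() if k != f.name}
        cases.append(("field_omitted", form.method, params))
    extra = dict(defaults)
    extra["unexpected_field"] = "1"
    cases.append(("field_added", form.method, extra))
    return cases


# Hypothetical registration form and tester-supplied default values.
SAMPLE = """
<form method="post" action="/register">
  <input type="text" name="user" maxlength="8">
  <select name="state"><option value="VA">VA</option><option value="MD">MD</option></select>
  <input type="hidden" name="plan" value="free">
</form>
"""
parser = FormParser()
parser.feed(SAMPLE)
defaults = {"user": "alice", "state": "VA", "plan": "free"}
cases = bypass_cases(parser, defaults)
for rule, method, params in cases:
    print(rule, method, params)
```

Each case changes exactly one thing relative to the valid default request, so an abnormal server response can be traced back to the single constraint that was violated.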

Example Interface: Yahoo registration form
(annotated screenshot; callouts:)
• Preset Transfer Mode in form definition (HTML)
• Preset Values (HTML)
• Limited Length (HTML)
• URL with preset Values (HTML)
• Preset No. of Fields (HTML)
• Inter-Value validation (script)
• Data Value, Type, & Format validation (script)

Test Value Selection
• Challenge:
  – How to automatically provide effective test values?
• “Semantic Domain Problem” (SDP)
  – Values within the application domain are needed
  – Enumeration of all possible test values is inefficient
• Possible Solutions
  – Random values (ineffective)
  – Automatically generated values (too hard)
  – Study the application and construct a set of values (feasible)
  – Tester input (feasible)
• AutoBypass uses an input domain created by parsing the interface and from tester input

AutoBypass
• AutoBypass Steps (the big picture)
  1. Parse Interface
  2. Set Default Values
  3. Generate Test Cases & Run Tests
  4. Review Results
• All HTML violation rules are used to generate test cases
• This version of AutoBypass does NOT automatically violate scripting validation, but:
  – AutoBypass behaves as a browser with scripts disabled
  – The tester can provide test inputs that will bypass scripting validation

AutoBypass
• Demo:
  69.255.103.24:8080/AutoBypass/
  Localhost:8080/AutoBypass

AutoBypass Architecture
(architecture diagram)

Experiment Design
How well can the tool perform on real web applications?
• Null Hypothesis:
  – Bypass testing of web applications will NOT expose more faults than standard testing
• Independent Variable:
  – Method of testing web applications
  – Two values are compared:
    • Bypass method
    • Industry standard testing method

Experiment Design (cont.)
Dependent Variable:
• Type of the server response given an invalid request submission:
  – (V) Valid Responses: invalid inputs are adequately processed by the server
  – (F) Faults & Failures: invalid inputs that cause abnormal server behavior (typically caught by the web server when the application fails to handle the error)
  – (E) Exposure: invalid input is not recognized by the server and abnormal software behavior is exposed to the users
* Both F & E are invalid responses

Experiment Design (cont.)
• Appropriateness vs. Expectancy
  – Responses for invalid inputs are not defined
• Preliminary results show a variety of “valid” responses
  – Further classification is defined:
    (V1) Server acknowledges the invalid request and provides an explicit message regarding the violation
    (V2) Server produces a generic error message
    (V3) Server apparently ignores the invalid request and produces an appropriate response
    (V4) Server apparently ignores the request completely
• It is unknown whether valid responses have actually resulted in corrupted data on the server

Subject Selection
• Criteria:
  – Complexity of the application
  – Ability to perform bypass testing
• Assumptions for web applications tested:
  – Products designed by professionals
  – Tested by their designers (yet testing methods are not well known or well defined)
  – Used by a significant number of users

Subjects
(application: modules tested)
• atutor.ca: Atalker
• nytimes.com: Us-markets
• demo.joomla.or: Poll, Users
• mutex.gmu.edu: Login form
• phpMyAdmin: Main page, Set Theme, SQL Query, DB Stats
• brainbench.com: Submit Request Info, New user
• myspace.com: Events & Music Search
• yahoo.com: Notepad, Composer, Search reminder, Weather Search
• barnesandnoble.com: Cart manager, Book search/results
• bankofamerica.com: ATM locator, Site search
• comcast.com: Service availability
• ecost.com: Detail submit, Shopping cart control
• google.com: Froogle, Language tools
• pageflakes.com: Registration
• amazon.com: Item dispatch, Handle buy
• wellsfargolife.com: Quote search

Results (1 of 2)
(results table)

Results (2 of 2)
(results table)

Result Graphs
(result graphs)

Results Summary
• 24% of tests caused invalid responses
• Null hypothesis is rejected*
• Problems Found:
  – Crashes and incorrect output (and possibly corrupt data on the servers)
  – Potential security vulnerabilities
    • Invalid input passed to the application without validation
    • Invalid input reached database queries
* With the exception of Google and Amazon

Results Summary (cont.)
• Testing Cost
  – Average of 1.8 hours per module tested (about 1¾ hours of human labor plus 5 minutes of computer processing)
• Violation Rules effectiveness
  (chart)

Confounding Variables
• AutoBypass Implementation
  – Tested for validity of results
  – Some violation rules are not implemented (scripting rules)
• Sample Selection
  – Complex interfaces could not be parsed
  – Selected only public, non-critical applications
  – Some interfaces had to be modified to allow testing

Confounding Variables (cont.)
• Tester Value Selection
  – Selection of additional values that violated the constraints
  – Little or no familiarity with the application domain
• Result Evaluation
  – Challenging process: about 90% of the testing cost
  – No access to the server, so faults may not be detected
  – Manual verification
  – Cross-rater evaluation would be helpful
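Since manual result evaluation dominates the testing cost, a first-pass triage of server responses is one place where some automation might help. The sketch below is an assumption-laden illustration, not part of AutoBypass: the pattern lists, the `Outcome` enum, and the `triage` function are hypothetical. It flags likely faults (F) and likely explicit rejections (V1/V2 responses). Exposure (E) and the V3/V4 responses cannot be told apart from the response text alone, so everything else is routed to manual review.

```python
import re
from enum import Enum


class Outcome(Enum):
    VALID = "V"          # server appears to have rejected the invalid input
    FAULT = "F"          # unhandled failure surfaced by the server
    NEEDS_REVIEW = "?"   # could be E, V3, or V4: a human has to decide


# Hypothetical signatures of unhandled server-side failures.
FAULT_PATTERNS = [
    r"stack ?trace",
    r"\bexception\b",
    r"sql syntax",
    r"fatal error",
    r"internal server error",
]

# Hypothetical phrases suggesting the server noticed the invalid input.
REJECT_PATTERNS = [
    r"\binvalid\b",
    r"\brequired\b",
    r"please (re-?)?enter",
    r"\berror\b",
]


def triage(status, body):
    """Classify one HTTP response by status code and page text."""
    text = body.lower()
    # Fault signatures are checked first so "fatal error" is not read as a rejection.
    if status >= 500 or any(re.search(p, text) for p in FAULT_PATTERNS):
        return Outcome.FAULT
    if any(re.search(p, text) for p in REJECT_PATTERNS):
        return Outcome.VALID
    return Outcome.NEEDS_REVIEW


# Hypothetical responses to bypass test cases.
samples = [
    (500, ""),
    (200, "You have an error in your SQL syntax near ..."),
    (200, "Invalid ZIP code, please re-enter."),
    (200, "Welcome back!"),
]
for status, body in samples:
    print(status, triage(status, body).name)
```

A filter like this would only shrink the pile that needs manual verification; it does not replace it, and the patterns would have to be tuned per application.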

Conclusions
• Bypass testing can reveal errors in web applications beyond what standard testing can find
  – Programs are still designed to depend on client-side interface constraints
  – Subjects with a significant number of users were less affected
    • Assumed to be the most expensive software
• Web development can benefit from bypass testing
  – Inexpensive to test applications in terms of resources and human labor
  – Efficient: the method creates a limited number of test cases
  – AutoBypass performs testing at the external system level
    • Access to the application source or server is NOT required
    • Platform independent
    • Can be combined with standard testing

Ways to improve AutoBypass
• Improve the interface parser
  – Eliminate scripting limitations
• Implement scripting violation rules
• Widen the scope of testing from a form/page to a site
  – Test sequences of events
  – Application-level input domain
• Explore possibilities for automated response evaluation

Questions?
Vasileios Papadimitriou
vpapadim@gmu.edu