Testing the Tester
Measuring Quality of Security Testing

Ofer Maor
CTO, Hacktics

OWASP Israel 2008 Conference
OWASP & WASC AppSec 2007 Conference, San Jose – Nov 2007
http://www.webappsec.org/

Copyright © 2007 - The OWASP Foundation
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 2.5 License. To view this license, visit http://creativecommons.org/licenses/by-sa/2.5/

The OWASP Foundation
http://www.owasp.org/

Introduction

- Security Testing is a Critical Element
  - Part of the system’s security lifecycle
  - Is in fact the QA of the security in the system
  - Provides the only way to assess the quality of security
- Nonetheless, Quality is Uncertain
  - No way to measure quality of security testing
  - No certification or other type of quality ranking
  - And even worse – we only know our testing failed when it’s too late
  - Even PCI and other regulations avoid this issue

OWASP Israel 2008 Conference – Sep 2008

Introduction

- Creates a Huge Challenge for Organizations
  - How to choose the right security testing solution
  - How can we guarantee that security testing is sufficient
- Makes Budget Decisions Harder…
  - How can we determine cost effectiveness
  - How to deal with purchasing constraints
- In This Presentation We Will:
  - Discuss aspects of security testing quality
  - Discuss means of assessing quality of testing solutions

Agenda

- Quality of Security Testing
  - False Positives / False Negatives
  - Coverage / Validity
  - Business Impact / Threat
- Security Testing Approaches (Pros & Cons)
  - Black/Grey/White Box
  - Penetration Test vs. Code Review
  - Automatic vs. Manual
- Testing the Tester
  - Determining the Right Approach
  - Evaluating Tools Quality
  - Evaluating Services Quality

Quality of Security Testing

- Quality of Testing is Essentially Measured by Two Elements:
  - False Negatives (Something was missed…)
    - Most obvious problem
    - Exposes the system to attacks
  - False Positives (Something was made up…)
    - Surprisingly – equal problem for enterprises
    - Generates redundant work and effort
    - Creates distrust (Cry Wolf Syndrome)
    - Not necessarily technological (the flaw is there – but poses no real threat)

False Negatives: Reasons

- Coverage of Tested Components
  - URL/parameter/component missed
  - Specific code section not reached in tested setup
    - Flow
    - Attributes
    - Users
- Coverage of Tests
  - Specific vulnerability not tested
  - Specific variant not tested

False Negatives: Reasons (Cont’d)

- Test Quality/Proficiency
  - Poorly defined test
    - Attack not properly created
    - Expected result not properly defined
  - Poorly run test – lack of proficiency
  - Technology changed / security measures exist
    - Test requires some modifications
    - Requires evasion techniques
    - Requires different analysis of results
- More…

Coverage Problems

- Application Data Coverage
  - Automatic crawling problems
    - Practically infinite links
    - Client-side created links (JS/AJAX)
    - Proper flow and context data
  - Availability
    - 3rd party components cannot be tested
    - Code unavailable
  - Size
    - Too many URLs/parameters/code to cover
    - Insufficient time

Coverage Problems

- Test Coverage
  - Vulnerability not tested
    - Impossible to test (Logical by Automated, Brute Force by Manual, etc.)
    - Newly Discovered Vulnerability (Not up to date yet…)
    - Seemingly Insignificant Vulnerability
    - Never-before-seen vulnerability (Mostly logical…)
    - Test may impair availability or reliability
  - Variant not tested
    - Too many possible variants (common with injection problems) – may require very specific tweaking
    - Logical vulnerability extremely dependent on actual application

Scope of Threat

- Who is Trying to Attack Us? What do They Want?
- Not Just Security Testing - The entire security solution should be relative to the threat
- Security Testing Cost-Effectiveness:
  - Simulates a script-kiddie / automated tools?
  - Simulates an average hacker?
  - Simulates focused attack with large resources?
- When is a Missed Vulnerability Classified as “False Negative” (And when is it out of scope?)

False Positives: Reasons

- Test Quality/Proficiency
  - Poorly defined test
    - Expected result not properly defined
    - Poorly defined validation differentiators
  - Poorly run test – lack of proficiency
  - Technology changed / security measures exist
    - Result appears vulnerable
    - Honeypots
    - “Patched” security solutions
  - Test unable to correlate to other components (code review)
- Technological vulnerability – no threat
  - Test does not correlate to context of the system

Validity

- How Can We Tell if It’s Really Vulnerable?
- Probe Tests
  - Attempt to determine validity by a set of tests and expected results
  - May wrongly identify a result as a vulnerability
  - Very susceptible to honeypots
- Exploits
  - Take advantage of a vulnerability to achieve an actual attack
  - Very high validity (might still fall for honeypots, though)
  - Requires additional effort and may pose a risk

Business Impact

- How Dangerous is This Vulnerability?
- Still Controversial
  - Do we really want to fix just vulnerabilities posing immediate threats?
  - Can we call it a vulnerability if it does nothing?
- Associating Risk
  - Organizations usually prioritize effort by risk
  - Do we really give each vulnerability the same risk level in every system?
  - Requires contextual understanding of system

Quality of Security Testing – Criteria Summary

- False Negatives
  - Application data coverage
  - Test coverage
  - Test quality / proficiency
  - Scope of threat
- False Positives
  - Test quality / proficiency
  - Validity
  - Business context
- Cost Effectiveness (% of acceptable False Negatives and Positives)

Security Testing Approaches (Pros & Cons)

- Black/Grey Box
  - Application vulnerability scanners
  - Manual penetration test
- White Box
  - Static code analyzers
  - Manual code review

Application Vulnerability Scanners

- Application Data Coverage
  - Good in terms of volume (Large applications)
  - Problematic in contextual aspects
    - Complex Flows
    - Multiple User Privileges
    - Specific data influences code executed
- Coverage of Tests
  - Generally good with Technical Vulnerabilities
  - Very limited with Logical Vulnerabilities
  - Variant Coverage – Wide, but not adaptive

Application Vulnerability Scanners

- Test Quality / Proficiency
  - Generally Good (Depends on product…)
  - However, fails to adapt to changes and non-standard environments
- Validity
  - Limited – Generally high rate of False Positives
  - No (or very limited) Exploits
  - Validation differentiators suffer from test quality
- Business Impact
  - None

Application Vulnerability Scanners

- Scope of Threat
  - Good against:
    - Script Kiddies
    - Tool based attacks
    - Basic-level Hackers
  - Usually insufficient with focused/advanced attacks
- Cost Effectiveness
  - Can be High, under the following circumstances:
    - Coverage and Quality issues not present with tested technology
    - Users of tools have sufficient security understanding
    - Tool used for many scans
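
The scanner Validity slide above notes that validation differentiators suffer from test quality. As a rough illustration of what one such differentiator might look like, the sketch below classifies a boolean-condition probe from three response bodies. The function names, the three-response scheme and the 0.95 similarity threshold are assumptions made for this example and are not taken from the presentation or from any particular scanner.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two response bodies."""
    return SequenceMatcher(None, a, b).ratio()


def probe_verdict(baseline: str, true_case: str, false_case: str,
                  threshold: float = 0.95) -> str:
    """Classify a boolean-condition probe (hypothetical differentiator).

    baseline   -- response body for the unmodified request
    true_case  -- response body with an always-true condition injected
    false_case -- response body with an always-false condition injected
    threshold  -- assumed cut-off for treating two bodies as "the same"
    """
    true_matches = similarity(baseline, true_case) >= threshold
    false_matches = similarity(baseline, false_case) >= threshold
    if true_matches and not false_matches:
        # Only the false condition changed the page: the injected
        # condition appears to influence the application logic.
        return "suspected"
    if true_matches and false_matches:
        # Neither condition changed the page: input appears ignored.
        return "not indicated"
    # Even the true condition changed the page: cannot decide.
    return "inconclusive"
```

A differentiator of this kind only compares outputs, so a honeypot or a "patched" security solution that imitates the expected difference would still yield "suspected". That is the weakness the False Positives and Validity slides attribute to probe tests.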

Manual Penetration Testing

- Application Data Coverage
  - Good in contextual aspects
    - Allows properly utilizing the application
    - Differences between users are usually clear
  - May be problematic in volume aspects
    - Nonetheless, proper categorization can solve volume issues
- Coverage of Tests
  - Very good – if working methodically
  - Variant coverage
    - Not necessarily wide
    - However, proper testing allows finding the right variants

Manual Penetration Testing

- Test Quality / Proficiency
  - Potentially good – but depends greatly on the person
  - Main advantage – allows creativity and adaptation to identify non-standard vulnerabilities
- Validity
  - Usually good – easier for a person to identify false positives
  - Easier to perform exploits
- Business Impact
  - Can be considered by tester

Manual Penetration Testing

- Scope of Threat
  - Depends on the actual effort and quality
  - Can be used to face advanced focused attacks
- Cost Effectiveness
  - Potentially high. May suffer when:
    - Very large applications
    - Need to only face automatic tools
  - Effort invested (and cost) may vary greatly

Manual Penetration Testing

- Additional Key Aspects
  - Quality varies greatly between testers
  - Quality may also vary for the same tester (time, mood, inspiration, etc.)
  - Black vs. grey box
    - Users
    - Coverage
    - Information Gathering
  - Business impact
    - Greatly depends on tester’s understanding
  - How much effort do we want to invest?

Static Code Analyzers

- Application Data Coverage
  - Generally good (no crawling setbacks)
  - Problematic when not all code available
- Coverage of Tests
  - Generally good with technical vulnerabilities
  - Very limited with logical vulnerabilities
  - Variant coverage – wide, but not adaptive

Static Code Analyzers

- Test Quality / Proficiency
  - Generally good in some aspects (Depends on product…)
  - High accuracy in identifying code violations
  - However, not necessarily security violations
- Validity
  - Very limited – high rate of false positives
  - Code violations may often not be exploitable
  - May be blocked by external components
- Business Impact
  - None. There is not even an application context (static)

Static Code Analyzers

- Scope of Threat
  - Good against:
    - Mostly syntax attacks (Injections)
    - Mostly script kiddies / tool based attacks
  - Usually insufficient with focused/advanced attacks
- Cost Effectiveness
  - Good for enforcing proper coding practices
  - However, requires modifying code that is not necessarily vulnerable

Manual Code Review (Static)

- Application Data Coverage
  - Can be problematic – usually impossible to go over every line of code
  - Requires smart analysis of what to review and what not to review
- Coverage of Tests
  - Generally good with technical vulnerabilities
  - Somewhat limited with logical vulnerabilities (often hard to determine full logic of non-running code)

Manual Code Review (Static)

- Test Quality / Proficiency
  - Potentially excellent (if properly done)
  - Allows identification of backdoors and rare issues
  - May suffer from inability to understand the flow of complex systems
- Validity
  - Limited – relatively high rate of false positives
  - Code violations may often not be exploitable
  - May be blocked by external components
- Business Impact
  - Partial.
    Depends on preliminary preparations

Manual Code Review (Static)

- Scope of Threat
  - Good against:
    - High-level attackers
    - Backdoors and rare problems
- Cost Effectiveness
  - Usually expensive, and only justifies costs in critical parts

Choosing the Right Approach

- Determining the Types of Threats
- Determining the Required Frequency
- Weighing Pros, Cons and Costs
- Usually – A combination of approaches applies:
  - Manual Penetration Test + Partial Code Review
  - Manual Penetration Test + Scanner (Free/Commercial)
  - Scanner + Partial Manual Penetration Test (Validation)
  - Scanner + Static Code Analyzer (Correlation)
  - Static Code Analyzer + Manual Code Review (Validation)
  - Etc…

Testing the Tester: Evaluating Quality of Security Testing

Testing the Tester

- The Hardest Part – Determining the quality of a security testing solution
- This Consists Of:
  - Identify % of false negatives
  - Identify % of false positives
  - Other aspects (Not discussed here)
    - Performance
    - Management
    - Reporting
    - Etc.

Testing the Tester

- How NOT to Determine Quality
  - Marketing material
  - Sales pitches
  - Magazine articles (Usually not professional enough)
  - Benchmarking on known applications (WebGoat, Hackme Bank, etc.)
- So What Should We Do?
  - References (that we trust…)
  - Comparative analysis (on our systems)
  - Ideally – Compare with a “perfect” report

Product Assessment

- Comparative Analysis
  - Run several products on a few systems in the enterprise
  - False Negatives
    - Ideally – compare against a report containing all findings – identify the percentage of false negatives in each product.
    - Alternatively – unite the real findings from all reports, and compare against that
  - False Positives
    - Perform validation of each finding to eliminate all false positives
    - Note the number of false positives in each product
  - Assess other aspects (if needed) – Details of report, speed of execution, etc.

Service Assessment

- Much Trickier
  - Hiring a consultant is like hiring an employee
- First of All - References
  - Find references of customers with similar environments and needs
  - Ask around yourself – the vendor will always provide you with their best references!
- Check the Specific Consultant
  - It’s not just about the company – it’s about the people involved in the project
  - Check the resume, perform an interview

Service Assessment

- Comparative Analysis
  - Similar to product – great way of comparing services
  - Main problem – Usually expensive
  - Important note – The benchmarking should be done without the testers’ prior knowledge!
  - False positive & negative assessment:
    - Mostly similar
    - Business impact, however, now plays a role – a good tester should eliminate (or downgrade) non-hazardous findings
    - See if testing includes strong validation (exploitation)
  - Quality of report and information gathered in it should also be examined

Some Techniques That (Sometimes) Help

- Application Data Coverage
  - Build (and approve) a test plan
  - List all components/modules tested
  - Require an automated tool in addition
  - Main problem – usually increases costs
- Test Coverage
  - Review methodology / list of tests
  - Main problem – does not really improve variant coverage
- Validation – Require strong validation (Exploits)

Summary

- Quality of Security Testing is Hard to Measure or Quantify
- Nonetheless – It is Important to Maintain Adequate Quality to Address the Threat
- Quality of Security (and Security Testing) Must be Guaranteed In Advance
- Maintaining Quality has an Associated Cost:
  - Testing for quality
  - Best products, best tools, best consultants
- Finding the Balance is Crucial

Thank You!
Discussion & Questions
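
The comparative analysis described under Product Assessment can be sketched as a small calculation. The example below assumes each product's report has been reduced to a set of finding identifiers and that every finding has been manually validated. The function name, the dictionary layout, the product names and the sample finding IDs are all hypothetical. The reference set is the "united real findings" fallback from the slides, so anything missed by every product stays invisible and the false negative percentages are lower bounds.

```python
def compare_products(reports, validated):
    """Compute false negative / false positive figures per product.

    reports   -- dict: product name -> set of finding IDs it reported
    validated -- set of finding IDs confirmed as real by manual validation
    """
    # Reference set: union of the real findings from all reports.
    reference = set().union(*reports.values()) & validated
    results = {}
    for product, reported in reports.items():
        false_pos = reported - reference   # reported but not real
        false_neg = reference - reported   # real but not reported
        results[product] = {
            "false_negatives": len(false_neg),
            "false_positives": len(false_pos),
            # Share of the real findings that this product missed.
            "false_negative_pct":
                100.0 * len(false_neg) / len(reference) if reference else 0.0,
            # Share of this product's report that was not real.
            "false_positive_pct":
                100.0 * len(false_pos) / len(reported) if reported else 0.0,
        }
    return results


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_reports = {
        "scanner_a": {"F1", "F2", "F3", "F9"},
        "scanner_b": {"F2", "F4"},
    }
    sample_validated = {"F1", "F2", "F3", "F4"}
    for name, stats in compare_products(sample_reports, sample_validated).items():
        print(name, stats)
```

The same calculation would apply to the service comparison, with one report per consultant instead of one per product.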