Improving Software Security with Precise Static and Runtime Analysis

Benjamin Livshits
SUIF Compiler Group, Computer Systems Lab, Stanford University
http://suif.stanford.edu/~livshits/work/griffin/

Security Vulnerabilities: Last 10 Years

[Chart: vulnerabilities reported per year, 1995-2005, rising steeply toward 5,000 a year by 2005. Source: NIST/DHS Vulnerability DB]

Which Security Vulnerabilities are Most Prevalent?

• Analyzed 500 vulnerability reports from one week in November 2005
[Chart: reports by category. Input validation dominates with 294 reports (58%); the remainder cover denial of service, file include, authentication bypass, temp. file manipulation, memory corruption, unauthorized access, privilege escalation, heap overflow, and other. Source: securityfocus.com]

Focusing on Input Validation Issues

[Chart: input validation reports by type: SQL injection, cross-site scripting, buffer overrun, information disclosure, code execution, path traversal, format string, integer overflow, HTTP response splitting, and other. Web application vulnerabilities account for the bulk of them. Source: securityfocus.com]

SQL Injection Example

• Web form allows the user to look up account details
• Underneath: a Java J2EE Web application serving the requests

String username = req.getParameter("user");
String password = req.getParameter("pwd");
String query = "SELECT * FROM Users WHERE username = '" + username +
               "' AND password = '" + password + "'";
con.executeQuery(query);

Injecting Malicious Data (1)

The user submits name bob and password ********; the application builds:

String query = "SELECT * FROM Users WHERE username = 'bob' AND password = '********'";

Injecting Malicious Data (2)

The user submits bob'-- as the name; the SQL comment marker removes the password check:

String query = "SELECT * FROM Users WHERE username = 'bob'--' AND password = ''";

Injecting Malicious Data (3)

The user submits bob'; DROP TABLE Users-- as the name; the injected statement destroys the table:

String query = "SELECT * FROM Users WHERE username = 'bob'; DROP TABLE Users--' AND password = ''";

Attack Techniques for Taint-Style Vulnerabilities

1. Sources (inject):
• Parameter manipulation
• Hidden field manipulation
• Header manipulation
• Cookie poisoning
• Second-level injection

2. Sinks (exploit):
• SQL injection
• Cross-site scripting
• HTTP request splitting
• Path traversal
• Command injection

1. Parameter manipulation + 2. SQL injection = vulnerability
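The fix the example calls for is to keep user input out of the SQL text entirely. A minimal sketch of the standard JDBC defense follows; the SafeLookup class and method names are mine, not from the deck:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.http.HttpServletRequest;

class SafeLookup {
    // Placeholders keep input out of the SQL syntax, so an input such as
    // "bob'; DROP TABLE Users--" is matched as a literal user name.
    static ResultSet lookup(Connection con, HttpServletRequest req) throws Exception {
        PreparedStatement stmt = con.prepareStatement(
            "SELECT * FROM Users WHERE username = ? AND password = ?");
        stmt.setString(1, req.getParameter("user"));  // bound as data, not as SQL
        stmt.setString(2, req.getParameter("pwd"));
        return stmt.executeQuery();
    }
}

With placeholders, the driver transmits the query structure and the data separately, so the inputs from slides (2) and (3) never reach the SQL parser.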
Goals of the Griffin Software Security Project

[Timeline, 2000-2006: new attack classes (cross-site scripting, cross-site tracing, HTTP response splitting, HTTP request smuggling, path traversal, DOM-based XSS, cookie poisoning, domain contamination) alongside publicized incidents at Microsoft Passport (command injection), Guess (SQL injection), Hotmail (XSS), the RI Government site (SQL injection), hundreds of Stanford sites, and mySpace (persistent XSS).]

• Financial impact
– Cost per incident: $300,000+
– Total cost of online fraud: $400B/year
• Griffin Project goals
– Address vulnerabilities in Web applications
– Focus on large Java J2EE Web applications

Griffin Project Contributions

• Effective solution: addresses a large range of real-life problems in the domain of Web application security
• Static analysis: pushed the state of the art in global static/pointer analysis; precisely handles large modern applications
• Runtime analysis: efficient dynamic techniques that recover from security exploits at runtime
• Experimental validation: comprehensive, large-scale evaluation of problem complexity and analysis features; discovered many previously unknown vulnerabilities

Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Overview of the Griffin Project)

Griffin Project: Framework Architecture

The user provides a vulnerability specification; together with the application bytecode, it drives both analyses.

• Static analysis [Livshits and Lam, Usenix Security ’05] → vulnerability warnings
– Finds vulnerabilities early
– Explores all program executions
– Sound: finds all vulnerabilities of a particular kind
• Dynamic analysis [Martin, Livshits, and Lam, OOPSLA ’05] → instrumented application
– Keeps vulnerabilities from doing harm
– Can recover from exploits
– No false positives, but has overhead

Following Unsafe Information Flow / Taint Flow

Servlet.getParameter("user") (source) → "…" + "…" (derivation) → sanitizer → Statement.executeQuery(...) (sink)

How do we know what these are?

Vulnerability Specification

• User needs to specify
– Source methods
– Sink methods
– Derivation methods
– Sanitization methods
• PQL: Program Query Language [Martin, Livshits, and Lam, OOPSLA ’05]
– General language for describing events on objects
• Real queries are longer
– 100+ lines of PQL
– Capture all vulnerabilities
– Suitable for all J2EE applications

query simpleSQLInjection
returns
    object String param, derived;
uses
    object HttpServletRequest req;
    object Connection con;
    object StringBuffer temp;
matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
}
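To make the query concrete, here is a sketch (mine, not from the deck) of servlet code that simpleSQLInjection matches; each commented line corresponds to one statement in the matches clause. One caveat: in real JDBC, executeQuery lives on Statement; the deck abbreviates it as con.executeQuery.

import java.sql.Connection;
import javax.servlet.http.HttpServletRequest;

class QueryMatchExample {
    static void handle(HttpServletRequest req, Connection con) throws Exception {
        String param = req.getParameter("user");                   // param = req.getParameter(_)
        StringBuffer temp =
            new StringBuffer("SELECT * FROM Users WHERE username = '");
        temp.append(param);                                        // temp.append(param)
        temp.append("'");
        String derived = temp.toString();                          // derived = temp.toString()
        con.createStatement().executeQuery(derived);               // con.executeQuery(derived)
    }
}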
Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Static Analysis)

Motivation: Why Pointer Analysis?

String username = req.getParameter("user");
list1.addFirst(username);
...
String str = (String) list2.getFirst();
con.executeQuery(str);

• What objects do username and str point to?
• That question is answered by pointer analysis
– A classic compiler problem for 20+ years
– We rely on context-sensitive inclusion-based pointer analysis [Whaley and Lam, PLDI ’04]

Pointer Analysis Precision

[Diagram: the unbounded runtime heap is approximated by a static representation with finitely many abstract objects o1, o2, o3.]

• Imprecision of pointer analysis → false positives
• Precision-enhancing pointer analysis features:
– Context sensitivity [Whaley and Lam, PLDI ’04] (not enough)
– Object sensitivity
– Map sensitivity

Importance of Context Sensitivity

Imprecision → excessive tainting → false positives.

String id(String str) { return str; }

Suppose id is called from context c1 with a tainted string and from context c2 with an untainted one.
– Context-insensitive analysis computes points-to(v : Var, h : Heap): results from both call sites merge, so c2's result appears tainted as well.
– Context-sensitive analysis computes points-to(vc : VarContext, v : Var, h : Heap): each calling context keeps its own result.
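Expanded into a compilable sketch (the class name, call sites, and outcome comments are mine, reflecting the c1/c2 scenario above rather than actual Griffin output):

import javax.servlet.http.HttpServletRequest;

class ContextDemo {
    static String id(String str) { return str; }

    static void demo(HttpServletRequest req) {
        String dirty = id(req.getParameter("user")); // context c1: argument tainted
        String clean = id("constant");               // context c2: argument untainted
        // Context-insensitive points-to(v, h): the two calls merge, so clean
        // appears tainted too, yielding a false positive.
        // Context-sensitive points-to(vc, v, h): c1 and c2 stay apart, so only
        // dirty is reported.
    }
}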
Handling Containers: Object Sensitivity

String s1 = new String();   // h1
String s2 = new String();   // h2

Map map1 = new HashMap();
Map map2 = new HashMap();

map1.put(key, s1);
map2.put(key, s2);

String s = (String) map2.get(key);

• Context sensitivity alone, points-to(vc : VarContext, v : Var, h : Heap): the two maps are merged, so the analysis derives both points-to(*, s, h1) and points-to(*, s, h2).
• 1-level object sensitivity, points-to(vo : Heap, v : Var, h : Heap): variables are additionally distinguished by their receiver object, so only points-to(*, s, h2) is derived.
• The dimensions combine: 1-level object sensitivity + context sensitivity is points-to(vc : VarContext, vo : Heap, v : Var, h : Heap); 2-level object sensitivity is points-to(vo1 : Heap, vo2 : Heap, v : Var, ho : Heap).

Inlining: Poor Man's Object Sensitivity

• Call graph inlining is a practical alternative
– Inline selected allocation sites
• Containers: HashMap, Vector, LinkedList, …
• String factories: String.toLowerCase(), StringBuffer.toString(), …
– Generally gives precise object-sensitive results
• Need to know what to inline; determining that is hard
– Inlining too much → doesn't scale
– Inlining too little → false positives
– Iterative process
• Can't always inline
– Recursion
– Virtual methods with more than one potential target

Map Sensitivity

String username = request.getParameter("user");
map.put("USER_NAME", username);
...
String query = (String) map.get("SEARCH_QUERY");
stmt.executeQuery(query);

Because "USER_NAME" ≠ "SEARCH_QUERY", the retrieved value is not tainted.

• Maps with constant string keys are common
• Map sensitivity augments the pointer analysis:
– Model HashMap.put/get operations specially

Analysis Hierarchy

Each precision dimension forms a hierarchy, from least to most precise:
– Context sensitivity: none → k-CFA → ∞-CFA
– Object sensitivity: none → 1-OS → k-OS → ∞-OS
– Map sensitivity: none → constant string keys → symbolic analysis of keys
– Flow sensitivity: none → local flow → partial → interprocedural predicate-sensitive

PQL into Datalog Translation [Whaley, Avots, Carbin, Lam, APLAS ’05]

The simpleSQLInjection PQL query shown earlier translates into a Datalog query:

simpleSQLInjection(hparam, hderived) :-
    ret(i1, v1), call(c1, i1, "ServletRequest.getParameter"),
    pointsto(c1, v1, hparam),
    actual(i2, v2, 0), actual(i2, v3, 1), call(c2, i2, "StringBuffer.append"),
    pointsto(c2, v2, htemp), pointsto(c2, v3, hparam),
    actual(i3, v4, 0), ret(i3, v5), call(c3, i3, "StringBuffer.toString"),
    pointsto(c3, v4, htemp), pointsto(c3, v5, hderived),
    actual(i4, v6, 0), actual(i4, v7, 1), call(c4, i4, "Connection.execute"),
    pointsto(c4, v6, hcon), pointsto(c4, v7, hderived).

The Datalog solver evaluates the query to produce vulnerability warnings and the set of relevant instrumentation points.

Eclipse Interface to Analysis Results

• Vulnerability traces are exported into Eclipse for review
– source → o1 → o2 → … → on → sink

Importance of a Sound Solution

• Soundness is
– the only way to provide guarantees about an application's security posture
– what allows us to remove instrumentation points for the runtime analysis
• Soundness claim: our analysis finds all vulnerabilities in statically analyzed code that are captured by the specification

Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Static Analysis Extensions)

Towards Completeness

• Completeness goal: specify the roots, discover the rest
– Analyze all code that may be executed at runtime

Generating a Static Analysis Harness

Entry points are read from each application's web.xml deployment descriptor:

<servlet>
    <servlet-name>blojsomcommentapi</servlet-name>
    <servlet-class>
        org.blojsom.extension.comment.CommentAPIServlet
    </servlet-class>
    <init-param>
        <param-name>blojsom-configuration</param-name>
    </init-param>
    <init-param>
        <param-name>smtp-server</param-name>
        <param-value>localhost</param-value>
    </init-param>
    <load-on-startup>3</load-on-startup>
</servlet>

A harness is generated that invokes every entry point:

public class Harness {
    public static void main(String[] args) {
        processServlets();
        processActions();
        processTags();
        processFilters();
    }
}

[Diagram: applications, each with its own web.xml, deployed on the JBoss application server; the analyzed code grows from 500,000+ to 1,500,000+ to 2M+ lines as library and server code are included.]

Reflection Resolution [Livshits, Whaley, and Lam, APLAS ’05]

Q: What object does this code create?

String className = ...;
Class c = Class.forName(className);
Object o = c.newInstance();
T t = (T) o;

Using string constants and user-provided specification points, the reflective allocation is resolved and rewritten:

String className = ...;
Class c = Class.forName(className);
Object o = new T1();   // or new T2(); or new T3();
T t = (T) o;
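A small compilable illustration of the pattern being resolved (assumption: the concrete class name java.util.ArrayList is mine, standing in for T1):

import java.util.List;

class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        String className = "java.util.ArrayList";  // string constant reaches forName
        Class<?> c = Class.forName(className);     // resolvable: c must be ArrayList
        Object o = c.newInstance();                // modeled like: Object o = new ArrayList();
        List<?> t = (List<?>) o;                   // even if className were unknown, this cast
        System.out.println(t.size());              // bounds o to the subtypes of List
    }
}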
Reflection Resolution Results

• Applied to 6 large Java apps, 190,000 lines combined
[Chart: call graph sizes, up to roughly 16,000 methods, before and after reflection resolution for jgap, freetts, gruntspud, jedit, columba, and jfreechart.]

Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Dynamic Analysis)

Runtime Vulnerability Prevention [Martin, Livshits, and Lam, OOPSLA ’05]

The vulnerability specification is compiled into instrumentation woven into each application running on the application server (JBoss). Two modes:
1. Detect and stop
2. Detect and recover

Runtime Instrumentation Engine

• PQL spec → state machines
– Run alongside the program, keeping track of partial matches
– Run recovery code before a match completes
[Diagram: a state machine whose transitions are events such as y := x, t = x.toString(), t.append(x), and y := derived(t), carrying object bindings like {x = o3} and {x = y = o3}; a sanitizer edge diverts the match.]

Reducing Instrumentation Overhead

Consider the simpleSQLInjection query shown earlier, applied to:

String name = req.getParameter("name");
StringBuffer buf1 = new StringBuffer();
StringBuffer buf2 = new StringBuffer("def");
buf2.append("abc");
buf1.append(name);
con.executeQuery(buf1.toString());
con.executeQuery(buf2.toString());

• Instrument only events on objects that may be part of a match: buf1 can carry the tainted name, but only constants ever flow into buf2, so its operations need no monitoring
• Soundness of the static analysis is what allows instrumentation points to be removed
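As a rough illustration of the runtime engine's job, here is a deliberately simplified taint tracker. This toy Tracker class is mine: Griffin actually compiles PQL queries into state machines and instruments only the points the static analysis could not prove safe.

import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class Tracker {
    // Objects currently bound in a partial match, compared by identity.
    private static final Set<Object> tainted =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    static String source(String s) {              // hook after e.g. req.getParameter(...)
        tainted.add(s);
        return s;
    }

    static void derive(Object from, Object to) {  // hook after e.g. append()/toString()
        if (tainted.contains(from)) tainted.add(to);
    }

    static String sink(String query) {            // hook before e.g. executeQuery(...)
        if (tainted.contains(query))
            throw new SecurityException("blocked tainted query: " + query);
        return query;                              // recovery could sanitize instead of aborting
    }
}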
Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Experimental Results)

Experimental Evaluation

• Comprehensive evaluation
– SecuriBench Macro [SoftSecTools ’05]: 11 applications, 3,203,698 lines in total
– SecuriBench Micro: 102 small benchmarks, 5,588 lines in total
– Google: SecuriBench
• Compare Griffin to a commercially available tool
– Griffin vs. Secure Software CodeAssure (March 2006 version)

Benchmark Statistics

[Table: per-application version, lines of code, expanded LOC, files, classes, and jars for the SecuriBench Macro applications (blojsom, blueblog, jboard, jgossip, jorganizer, pebble, personalblog, road2hibernate, roller, snipsnap, webgoat). Individual applications run from roughly 2,600 to 73,000 lines of code, expanding to between roughly 52,000 and 558,000 lines once libraries are included. Totals: 327,284 LOC, 3,203,698 expanded LOC, 2,166 files, 6,823 classes, 542 jars.]

SecuriBench Macro: Static Summary

[Table: sources, sinks, vulnerabilities, and false positives per benchmark (blueblog, webgoat, personalblog, blojsom, jboard, snipsnap, road2hibernate, jorganizer, jgossip, pebble, roller): 98 vulnerabilities in total, with zero false positives on most benchmarks.]

Vulnerability Classification

Sources (rows) against sinks (columns):

Source / Sink            SQL injection   HTTP splitting   Cross-site scripting   Path traversal   Totals
Header manipulation            0               1                   0                   0              1
Parameter manipulation        55              11                   2                  10             78
Cookie poisoning               0               1                   0                   0              1
Non-Web inputs                15               0                   0                   3             18
Totals                        70              13                   2                  13             98

• Reported the issues back to the program maintainers
• Most of them responded; most issues were confirmed as exploitable
• Vulnerability advisories were issued

SecuriBench Micro: Static Summary

Category          Tests   Vulnerabilities   False positives
basic               41          59                 3
collections         14          13                 3
interprocedural      0           0                 0
arrays               9           8                 4
predicates           9           4                 4
sanitizers           6           4                 2
aliasing             6          11                 1
data structures      6           5                 1
strong updates       5           1                 2
factories            0           0                 0
session              3           3                 1
Total              102         111                21

A Study of False Positives in blojsom

Q: How important are the analysis features for avoiding false positives?

blojsom: 50K lines of code (333K expanded), 395 classes, 40 sources, 29 sinks, 6 actual vulnerabilities.

Configuration             False positives
Base                           114
+ context sensitivity           84
+ object sensitivity            43
+ map sensitivity                5
+ sanitizers added               0

Griffin vs. CodeAssure

Q: What is the relationship between false positives and false negatives?

[Charts: vulnerabilities, false positives, and false negatives for Griffin vs. CodeAssure. On SecuriBench Macro, CodeAssure misses 80+ vulnerabilities that Griffin finds; on SecuriBench Micro it misses 40+.]

Deep vs. Shallow Vulnerability Traces

Q: How complex are the vulnerabilities we find?

[Histogram: number of vulnerability paths by path length, from 1 to 16; most paths are short, with a tail of deep paths.]

Analyzing personalblog

Q: What is the connectivity between sources and sinks?

• 226,931 analyzed lines of code, most of it Hibernate library code rather than application code
• 45 total sources, 137 total sinks; 5 used sources, 8 used sinks
• 1,806 reachable objects; 100 paths through the graph; 40 source-sink invocation site pairs
• A single falsely tainted object (reaching sf.hibernate.Session.find(…)) → 100+ false positives

Runtime Analysis Results

• Experimental confirmation
– Blocked exploits at runtime in our experiments
• Naïve implementation
– Instrument every string operation → high overhead
• Optimized implementation
– 82-99% of all instrumentation points are removed
[Chart: runtime overhead, unoptimized vs. optimized, for webgoat, personalblog, road2hibernate, snipsnap, and roller; unoptimized overhead reaches roughly 140%, optimized overhead falls below 1%.]

Outline: Overview / Static / Extensions / Dynamic / Experiments / Conclusions / Future (now: Conclusions)

Related Work & Conclusions: Lessons Learned

• Context sensitivity is good. Object sensitivity is great but hard to scale; scaling it is an important open problem
• Can't ignore reflection in large programs; reflection makes the call graph much bigger
• Many of the bugs are pretty shallow; there are, however, complex bugs, especially in library code
• Practical tools tend to introduce false negatives to avoid false positives; that is not necessarily a good choice
• Automatic recovery from vulnerabilities is a very attractive approach, and its overhead can be reduced

Related Work

• Web application security work
– Penetration testing tools (black-box testing)
– Application firewalls (listen on the wire and match patterns)
• Practical automatic static error detection tools
– WebSSARI (static and dynamic analysis of PHP) [Huang et al. ’04]
– JDBC Checker (analysis of Java strings) [Wassermann and Su ’04]
– SQLrand (SQL keyword randomization) [Boyd and Keromytis ’04]
[Chart: Web application security papers per year, 2004-2006, on the rise; our Usenix ’05 and OOPSLA ’05 papers marked. Source: http://suif.stanford.edu/~livshits/work/griffin/lit.html]

Future Work

• Applying model checking to Web applications (with Michael Martin)
• Learning specifications from runtime histories (with Naeim)
• Partitioned BDDs to scale bddbddb better (with Jean-Gabriel and Prof. Dill)
• Analyzing sources of imprecision in Datalog
• Analyzing sanitization routines
• Attack vectors in library code
• Type qualifiers in Java (with Dave Greenfieldboyce at UMD)
• Using model checking to break sanitizers

Special Thanks

Stella, my parents, my sister Monica.
Alex, Dan, Dawson, Elizabeth.
Ramesh Chandra, Darlene Hadding, David Heine, Michael Martin, Brian Murphy, Joel Sandin, Constantine Sapuntzakis, Chris Unkel, John Whaley, Kolya Zeldovich.
Dzintars Avots, Ron Burg, Mark Dilman, Craig Foster, Chris Kaelin, Amit Klein, Ted Kremenek, Iddo Lev, John Mitchell, Carrie Nielsen, David Pecora, Ayal Pincus, Jai Ranganathan, Noam Rinetzky, Mooly Sagiv, Elena Spector, Jeff Ullman, Eran Yahav, Gaylin Yee, Andreas Zeller, Tom Zimmermann.
National Science Foundation.

The End.

Griffin Security Project: http://suif.stanford.edu/~livshits/work/griffin/
Stanford SecuriBench: http://suif.stanford.edu/~livshits/securibench/
Stanford SecuriBench Micro: http://suif.stanford.edu/~livshits/work/securibench-micro/
PQL language: http://pql.sourceforge.net/

Publications:
1. Finding Security Vulnerabilities in Java Applications with Static Analysis, Livshits and Lam, 2005.
2. Finding Application Errors and Security Flaws Using PQL, Martin, Livshits, and Lam, 2005.
3. Defining a Set of Common Benchmarks for Web Application Security, Livshits, 2005.
4. Reflection Analysis for Java, Livshits, Whaley, and Lam, 2005.
5. DynaMine: Finding Common Error Patterns by Mining Software Revision Histories, Livshits and Zimmermann, 2005.
6. Locating Matching Method Calls by Mining Revision History Data, Livshits and Zimmermann, 2005.
7. Turning Eclipse Against Itself: Finding Bugs in Eclipse Code Using Lightweight Static Analysis, Livshits, 2005.
8. Context-Sensitive Program Analysis as Database Queries, Lam, Whaley, Livshits, Martin, Avots, Carbin, and Unkel, 2005.
9. Improving Software Security with a C Pointer Analysis, Avots, Dalton, Livshits, and Lam, 2005.
10. Finding Security Errors in Java Applications Using Lightweight Static Analysis, Livshits, 2004.
11. Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs, Livshits and Lam, 2003.
12. Mining Additions of Method Calls in ArgoUML, Zimmermann, Breu, Lindig, and Livshits, 2006.