SCRIPTGARD: Automatic Context-Sensitive Sanitization for Large-Scale Legacy Web Applications

Prateek Saxena, UC Berkeley
David Molnar, Microsoft Research
Ben Livshits, Microsoft Research

Large-Scale Legacy Applications
How to Secure Legacy Apps?
• Step-up in Scale
  – Half a Million LOC
  – Shared Development by teams of 100+
• What’s The Difference?
  – Shifting Platforms isn’t practical
  – Long Program Paths, Many Sanitizers Applied

XSS in Large-Scale Applications
  String Img.RenderControl() {
    Write(userimg);
    Write(Sanitize(userimg));
  }
Small-Scale Apps
• Buggy Sanitizer
• Missing Sanitization
  – [Pixy’06, PhpTaint’06, Cqual’04, Merlin’09, Securifly’05, PhpAspis’11, Saner’08, Bek’11]
Large-Scale Applications
• New Sanitization Errors
  – [CCS’11]
• SCRIPTGARD

Contributions
• Does Sanitization Defense Fail In Practice?
  – 7 Commercial Applications, 400 KLOC
• 2 New Classes of Errors in Sanitizer Use
  – How Often & Why
• SCRIPTGARD: Automated Sanitizer Use Analysis
  – Legacy .NET
  – Minimal Specs
  – Concrete Test Cases
  – Can Auto-Correct Sanitization During Deployment

Error #1: Context-Mismatched Sanitization (CMS)
  <img src="sunset.gif" height="right">
  <a href="javascript: document.write('…');"> Diapers </a>
  <script> var name='Stewie'; </script>
HtmlEncode: HTML Tag Context
JSStringEncode: JS String Context
  \r\n; alert(document.cookie);
Which Sanitizer To Apply Where?
1,207 (4.7%) are CMS errors!
[Chart values: 1207, 23904]

Why Does Context-Mismatch Happen?
Context is a Global Path-Sensitive Property
But developers select Sanitizers Locally
[Diagram labels: San, Output Sink]

Error #2: Inconsistent Multiple Sanitization (IMS)
Does the Order Matter?
[Diagram labels: Attack Input, San 1, San 2, Output Sink, Safe?]

Inconsistent Multiple Sanitization (IMS): Does it Really Happen?
285 (8%) of multiple sanitizations are errors!
[Diagram labels: Attack Input; HtmlEncode, JSStringEncode; JSStringEncode, HtmlEncode]
[Chart values: 285, 2960, 21964]

Why Does IMS Happen?
  <script> document.write('<a href="userlink"></a>'); </script>
[Diagram labels: SERVER-SIDE OUTPUT, Output Sink]

Why Does IMS Happen: Nested Contexts
  <script> document.write('<a href="userlink"></a>'); </script>
Contexts: JS String Context, URL Attribute Context
JS Parser (JS Unicode Decode): \u0022 -> "
HTML Parser (Html-Entity Decode): &quot; -> "
Wrong Sanitizer Order: \u0022 -> "
Correct Sanitizer Order: \u0026quot; -> &quot; -> "
Nested Contexts Cause Developer Confusion!

How Common Are Nested Contexts?
Nesting Depth: Up to 4
[Chart: nesting depths 1, 2, 3, 4; values 1093, 104, 2948, 16949]

Take-Aways…
Small-Scale Apps
• Buggy Sanitizer
• Missing Sanitization
  – [Pixy’06, PhpTaint’06, Cqual’04, Merlin’09, Securifly’05, PhpAspis’11, Saner’08, Bek’11]
Large-Scale Applications
• Shared Paths lead to…
• CMS & IMS
• Developers apply correct sanitizers wrongly

How Do We Find Sanitization Errors In Legacy Applications At Scale?

SCRIPTGARD Analysis
[Diagram labels: Legacy .NET, SCRIPTGARD, HTTP Requests, Sanitizer Specification, Instrumented Server-side DLLs, Inconsistently Sanitized Test Cases]

SCRIPTGARD Analysis: Key Ideas
Path-Sensitive Positive Taint-Tracking
Determine Contexts
Browser Model
[Table columns: Trusted?, Sanitizer Sequence]
  Path 1: + HtmlAttributeEncode, JSStringEncode
  Path 2: HtmlEncode, JSStringEncode
  Path 3: + HtmlAttributeEncode
  Path 4: JSStringEncode, HtmlEncode
[Result labels: CMS, IMS]

Precise Context Determination: Browser Parser Model
[Diagram label: Contexts]

How Can We Correct Sanitization Errors Automatically?

SCRIPTGARD: Can We Auto-Patch Sanitization Errors?
• The Bad News: Large slowdown
• Observation: Less than 10% paths problematic
Can We Detect When A Problematic Path Is Executed?
• Yes!
  – Preferential Path Profiling [POPL’06]
  – Negligible Overhead

SCRIPTGARD Auto-Correction
[Diagram labels: SCRIPTGARD Pre-Release Analysis, Sanitization Cache, Sanitizer Patch, Deployment, Preferential Path Profiler, Server Code With Light-weight Instrumentation]

Conclusions
• 2 New Patterns of Errors in Sanitizer Use
  – Context-Mismatched Sanitization
  – Inconsistent Multiple Sanitization
• SCRIPTGARD
  – Effective Analysis Tool
  – Auto-Correction with Negligible Overhead
[Chart values: 1207, 285, 23717]

You have been a wonderful audience
…you stayed…
Prateek Saxena
http://www.cs.berkeley.edu/~prateeks/

Sanitizer Correction is Challenging
Can We Just Replace HtmlEncode with another Sanitizer?
Contexts Vary By Path Executed
[Diagram labels: San, HtmlEncode, San, Output Sink]
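
Sketch: why sanitizer order matters in nested contexts
The nested-context example from the "Why Does IMS Happen" slides can be sketched roughly as follows. This is a toy model, not SCRIPTGARD's analysis or the .NET sanitizer library: html_encode and js_string_encode are hypothetical stand-ins for HtmlEncode and JSStringEncode, and the "browser" is reduced to the two decoding steps named on the slides (JS Unicode Decode, then Html-Entity Decode).

```python
import html
import re


def html_encode(s: str) -> str:
    # Hypothetical stand-in for HtmlEncode: escapes & < > " '
    return html.escape(s, quote=True)


def js_string_encode(s: str) -> str:
    # Hypothetical stand-in for JSStringEncode: every character other than
    # an ASCII letter or digit becomes a \uXXXX escape.
    # Assumption: input stays within the Basic Multilingual Plane.
    return "".join(
        c if (c.isascii() and c.isalnum()) else "\\u%04x" % ord(c) for c in s
    )


def js_unicode_decode(s: str) -> str:
    # Toy model of the JS parser step: resolve \uXXXX escapes in one pass.
    return re.sub(
        r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), s
    )


def html_entity_decode(s: str) -> str:
    # Toy model of the HTML parser step: resolve character references.
    return html.unescape(s)


def breaks_out_of_href(encoded: str) -> bool:
    # Models <script>document.write('<a href="ENCODED"></a>');</script>.
    # The JS parser decodes the string literal first and hands the result
    # to the HTML parser; a raw double quote at that point ends the href
    # attribute, so the rest of the input is parsed as markup.
    return '"' in js_unicode_decode(encoded)


if __name__ == "__main__":
    attack = '" onmouseover="alert(document.cookie)'

    # Outer sanitizer applied first: the JS decoder undoes the only
    # effective encoding before the HTML parser runs.
    wrong = html_encode(js_string_encode(attack))

    # Inner (HTML) sanitizer first, outer (JS) sanitizer last.
    right = js_string_encode(html_encode(attack))

    print("wrong order breaks out:  ", breaks_out_of_href(wrong))
    print("correct order breaks out:", breaks_out_of_href(right))
    print("correct order round-trips:",
          html_entity_decode(js_unicode_decode(right)) == attack)
```

In this model the sanitizer for the innermost context has to run first and the sanitizer for the outermost context last, because the browser decodes in the opposite direction. Real browsers have many more contexts and decoding steps than these two, which is why the slides rely on a browser parser model rather than a fixed rule.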