CROSSING ORIGINS BY CROSSING FORMATS
Jonas Magazinius, Andrei Sabelfeld – Chalmers University of Technology
Billy K. Rios – Cylance Inc.

ABOUT
• PhD Student, Chalmers
  • until Nov 1st, then Dr. Magazinius
  • Securing the mashed up web
  • 10:00 HA4 – Hörsalsvägen, Chalmers
• Co-leader of OWASP Gothenburg
• Part of Cure53
• @internot_
• Father – as some of you might remember

LANGUAGE-BASED SECURITY
• Using programming language theory for finding and mitigating security vulnerabilities
  • Static vs. dynamic analysis
  • Information-flow monitoring
  • Declassification
  • Decentralized
• Crossing origins by crossing formats
  • Byproduct of research
  • Joint work with Billy K. Rios
  • Greatly inspired by the work of Julia Wolf

BACKGROUND
• GIFAR – content smuggling attack
  • Billy Rios (@XSSniper), Petko D. Petkov (@pdp)
  • Attacker uploads GIF/JAR file
• Cross-origin CSS attack
  • Chris Evans (@scarybeasts) et al.
  • Attacker injects fragments of CSS into HTML
• Content-type sniffing attacks
  • Adam Barth (@adambarth) et al.
  • Attacker uploads PS/HTML file

THINGS IN COMMON…
• … mixing formats
• … re-interpretation of the content

POLYGLOT
• Definition:
  • “…a person who speaks several languages.”
  • “…a program that is valid in multiple programming languages.”
  • Content that can be interpreted as multiple formats
• Example 1 – HTML / JavaScript
  • data:text/html,alert('<script src="%23"></script>')
• Example 2 – C / Pascal / PostScript / TeX / Bash / Perl / Befunge98
  • (*a/*/ % #)(PostScript)/Helvetica 40 selectfont 9 400 moveto show%v"f"a0 true showpage quit%#) 2>/dev/null;echo bash;exit #*/);int main()/*>"eb"v %a*0)unless print"perl\n"__END__*/{printf("C\n");/*>>#;"egnu">:#,_@;,,,< *)begin writeln(*\output={\setbox0=\box255}\eject\shipout\hbox{\TeX}\end *)('pascal');end.{*/return 0;}

MALICIOUS POLYGLOTS
• Two formats (or more)
  • One benign
  • One malicious
• GIFAR – GIF/JAVA
• Cross-origin CSS – HTML/CSS
• Content-type sniffing – PS/HTML
• Preferred format characteristics
  • Widespread, commonly used format
  • Error tolerant parsing, or other ways to hide foreign syntax
  • Cross-origin communication

POLYGLOT ATTACKS
• Infiltrate
  • Syntax injection – Cross-origin CSS attack
  • Content smuggling – GIFAR
• Embed
  • Context-based re-interpretation
  • The content-type provided by the server is overridden
• Tags that allow re-interpretation of content:
  • CSS – <link> tag
  • Java – <applet> tag
  • Content sniffing – <iframe> tag
  • <object> and <embed> allow arbitrary interpretation based on the type attribute

ATTACK VECTORS – SYNTAX INJECTION
• A vulnerable web service reflects parameters into content
• Fragments of syntax are injected, resulting in a polyglot
• The polyglot is embedded under the origin of the attacker
• The polyglot has the origin of, and can communicate with, the vulnerable service
• Visitors of the attacker's domain are exploited
• Known attack instances
  • Cross-origin CSS attack
  • (Cross-site scripting)
[Diagram: numbered steps (1)–(4) between attacker.com and vulnerable.com]

ATTACK VECTORS – CONTENT SMUGGLING
• A vulnerable web service allows users to upload content
• Attacker uploads a polyglot to the vulnerable origin
• The polyglot is embedded under the origin of the attacker
• The polyglot has the origin of, and can communicate with, the vulnerable service
• Visitors of the attacker's domain are exploited
• Known attack instances
  • GIFAR
  • Content sniffing attack
[Diagram: numbered steps (1)–(5) between attacker.com and vulnerable.com]

PAYLOADS – EXPLOITING THE ORIGIN
• Cross-origin information leakage
  • Request sensitive user information
  • Leak to attacker across origins
• Cross-site request forgery
  • Traditionally, issue requests with the credentials of the victim
  • Protect using tokens
  • Impact is far greater if it is possible to read the response
    • Extract token
    • Make request

PORTABLE DOCUMENT FORMAT
• Standardized document format – ISO 32000-1
• Container format
  • Embed related resources
  • Contain foreign syntax by design
• Error tolerant parsing
• Powerful capabilities
  • Display text
  • Render 2D/3D graphics
  • Animations
  • Forms
  • Launch commands (restricted)
  • Execute JavaScript
  • Embed Flash – just fantastic
  • Issue HTTP requests
    • With cookies!!
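
The "contain foreign syntax by design" and "error tolerant parsing" points suggest that an upload check which looks only at a file's leading signature can miss an embedded PDF. The following is a minimal sketch of a wider scan, under stated assumptions: the name `looks_like_pdf` and the 1024-byte window are illustrative choices, not taken from the slides, and a scan like this is a heuristic rather than a complete defense.

```python
# Hypothetical upload check: flag content that carries a PDF header marker
# near its start, even when the file begins with another format's signature.

PDF_MARKER = b"%PDF"
SCAN_WINDOW = 1024  # assumption: how far into the file a lenient reader looks


def looks_like_pdf(data: bytes, window: int = SCAN_WINDOW) -> bool:
    """Return True if the PDF marker appears in the first `window` bytes."""
    return PDF_MARKER in data[:window]


if __name__ == "__main__":
    gif_only = b"GIF89a" + b"\x00" * 32
    # Benign-looking GIF prefix followed by a PDF header on a later line.
    polyglot = gif_only + b"\n%PDF-1.\n"
    print(looks_like_pdf(gif_only))
    print(looks_like_pdf(polyglot))
```

A check on the leading bytes alone would accept both inputs as GIF; the wider scan separates them, at the cost of false positives on benign files that happen to contain the marker.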
DOCUMENT STRUCTURE
• Header
  • %PDF-1.7
• Objects
  • 1 0 obj << /Length 14>> stream Content stream endstream endobj
• Cross-reference
  • xref 00000012 0000 n endxref
• Trailer
  • startxref 105
  • trailer << /Root 1 0 R >>
  • %%EOF

MINIMAL PDF (ACCORDING TO SPECIFICATION)
%PDF-1.4
1 0 obj<< /Type /Catalog /Outlines 2 0 R /Pages 3 0 R >> endobj
2 0 obj<< /Type /Outlines /Count 0 >> endobj
3 0 obj<< /Type /Pages /Kids [4 0 R] /Count 1 >> endobj
4 0 obj<< /Type /Page /Parent 3 0 R /MediaBox [0 0 612 792] /Contents 5 0 R /Resources << /ProcSet 6 0 R >>>> endobj
5 0 obj<< /Length 35 >>stream
endstream endobj
6 0 obj[/PDF] endobj
xref
0 7
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000300 00000 n
0000000384 00000 n
trailer<< /Size 7 /Root 1 0 R>>
startxref
408
%%EOF

MINIMAL PDF (ACCORDING TO INTERPRETER)
Adobe Reader
%PDF-1.
trailer<</Root<</Pages<<>>>>
…or executing JavaScript…
%PDF-1.
trailer<</Root<</Pages<<>> /OpenAction<</S/JavaScript /JS(app.alert('PDF'))>> >>

Google Chrome PDF Reader
%PDF 1 0 obj<</Pages<<>>>> trailer<</Root 1 0 R>>
…or even shorter…
%PDF trailer% 1 0 obj <</Root 1 0 R/Pages<<>>>>
…or even shorter…
%PDF trailer<</Root% 1 0 obj<</Pages 1 0 R>>

ERROR TOLERANT PARSING
This text would also be a valid %PDF-1. With the condition that the
trailer %begins on a new line and that there isn’t
<</too /much /garbage /in /Root<</Pages<<>>>> the dictionary.

COMMUNICATION
• PDF
  • URL Action – Redirects the browser
    %PDF-1.
    trailer <</Root <</Pages<<>> /OpenAction <</S/URI/URI(javascript:alert(location))>> >>>>
  • JavaScript
    • Inherits the origin of the document
    • Uses the cookies of the browser
    • launchURL() – Redirects the browser
    • getURL() – Redirects the browser
    • submitForm() – POST request via the browser
  • XML External Entity
    • Two-way communication
    • Patched in latest version of Adobe Reader (FINALLY)
• Embedded Flash
  • Inherits the origin of the document
  • Two-way communication
  • Uses its own set of cookies

PDF POLYGLOTS
Syntax injection
• Easy to inject
• Token-set overlaps with HTML
• Impact
  • Context dependent
  • Can extract sensitive information
    • CSRF protection token
    • User information
  • CSRF
  • Cross-origin leakage
Content smuggling
• Mixes well with just about any format
• Server can verify benign format
• Impact
  • CSRF
  • Cross-origin leakage

PDF-BASED SYNTAX INJECTION ATTACK

PDF-BASED CONTENT SMUGGLING ATTACK

POTENTIAL TARGETS
Syntax injection
• User-supplied content reflected
  • XSS vulnerabilities
  • JSON
  • XML
Content smuggling
• PDF as the malicious format
  • User-provided content of any kind
• PDF as the benign format
  • CV database
  • Conference systems

DEMO
http://internot.noads.biz

EVALUATION
• Syntax injection
  • Approach
    • Alexa top100
  • Results
• Content smuggling
  • Approach
  • Results
• Responsible disclosure

ALEXA TOP100

MITIGATION APPROACHES
Forward notification approach
• Determine context
• Send expected content-type as header
  • Content-Type: application/pdf
  • Content-Type: image/*
• Server decides whether content matches expected content-type
• Gives the server control over the interpretation of contents
  • Error code (404, 500)
  • Alternate content

MITIGATION APPROACHES
Server side (application)
• Syntax injection
  • Filtering? In general, no!
• Content smuggling
  • Serve content from a sandboxed domain (googleusercontent.com)
Client side
• Browser
  • Strict enforcement of server-provided content-type
  • Disallow type attribute
• Interpreter
  • Strict(er) parsing?
  • Limit communication methods

PDF MITIGATION APPROACHES
Server side
• Filtering
  • PDF tokens and keywords { <, >, trailer }
• Content Security Policy
• DO NOT!!!
Client side
• Improvements in latest version
  • Matching first bytes against known magic values
  • Already found a bypass!
• Limit worst communication method

DO NOT!!!
Content-Disposition: attachment; filename="fname.ext"
Content-Type: application/octet-stream
“If this header is used in a response with the application/octet-stream content-type, the implied suggestion is that the user agent should not display the response, but directly enter a `save response as...' dialog.”
• This is NOT respected by Adobe Reader

SUMMARY
• Polyglot attacks – New breed of cross-origin attacks
  • Syntax injection
  • Content smuggling
• PDF-based polyglot attacks
  • Flexible error tolerant format
  • Powerful beyond necessity
• Mitigation approaches
  • Forward notification approach
  • Specific approaches

THANK YOU!

CROSS-ORIGIN CSS ATTACK
• Minimal amount of CSS syntax injected in target HTML page
  • {}#f{font-family:'
  • … arbitrary HTML content …
  • '}
• Attacker uses the HTML page as a style sheet in his page
• Victim visits the attacker's page
• Attacker can extract the arbitrary content from the imported style sheet

GIFAR – CONTENT SMUGGLING ATTACK
• GIF image
  • Parsed top-down, content after trailer ignored
• JAR file
  • Based on ZIP archives
  • Parsed bottom-up, content before header ignored
• GIF + JAR = GIFAR
  • copy /b benign.gif + malicious.jar gifar.gif
• The GIFAR is uploaded to a vulnerable service
• The GIFAR is embedded from the vulnerable service on the attacker's page as an applet
• Any visitor to the attacker's page will execute the applet

CONTENT SNIFFING ATTACK
• Browser performs content sniffing when the server provides an unknown content-type
  • Content is matched against a series of signatures
  • If a match is found, the content is interpreted as the matched type
• Attacker creates a “chameleon” file
  • Benign format + HTML
  • The file is crafted to match the HTML signature
• The chameleon is uploaded to a vulnerable service
• The chameleon is embedded in an iframe on the attacker's page
• Any visitor will trigger the content sniffing and render the HTML
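
The mismatch that the content sniffing attack relies on can be sketched as a toy model. Everything here is an illustrative assumption rather than the behavior of any real browser or server: the signature list, the 256-byte window, and the names `sniff` and `server_filter` are hypothetical. The sketch only shows how a filter that trusts the leading magic bytes and a sniffer that searches for HTML signatures can disagree about the same file.

```python
# Toy content sniffer; signatures and the scan window are illustrative
# assumptions, not the rules of any real browser.

HTML_SIGNATURES = (b"<html", b"<script", b"<body", b"<iframe")
SNIFF_WINDOW = 256  # assumption: only the first bytes are inspected


def sniff(data: bytes) -> str:
    """Return the type a naive sniffer would assign to `data`."""
    head = data[:SNIFF_WINDOW].lower()
    # HTML signatures are checked first, anywhere in the window ...
    if any(sig in head for sig in HTML_SIGNATURES):
        return "text/html"
    # ... so a leading PostScript header is only reached when no HTML matched.
    if head.startswith(b"%!ps"):
        return "application/postscript"
    return "application/octet-stream"


def server_filter(data: bytes) -> str:
    """Hypothetical upload filter that trusts only the leading magic bytes."""
    return "application/postscript" if data.startswith(b"%!PS") else "unknown"


if __name__ == "__main__":
    # PostScript header followed by HTML hidden in a PostScript comment.
    chameleon = b"%!PS-Adobe-3.0\n% <html><script>alert(1)</script></html>\n"
    print(server_filter(chameleon))
    print(sniff(chameleon))
```

In this model the server accepts the chameleon as PostScript while the sniffer classifies it as HTML, which is the disagreement the attack needs.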