mXSS Attacks: Attacking well-secured Web-Applications by using innerHTML Mutations
Presenter: Liu Yin
Computer Science Department
College of William & Mary

Outline
- Introduction
  - XSS
  - mXSS
- Problem Description
  - The innerHTML Property
  - Mutation
- Exploits
  - Seven attack vectors
- Attack Surface
- Mitigation Techniques
- Evaluation
- Conclusion

XSS (Cross-Site Scripting)
- XSS enables attackers to inject client-side script into web pages viewed by other users.
- It becomes possible when a web site allows users to supply uncontrolled content:
  - Users can write content in a guest book or forum.
  - Users can introduce malicious code into that content.
- Example: eBay
- What the malicious code can do:
  - Modify the Document Object Model (DOM), e.g. change some links or add some buttons.
  - Send personal information to third parties (JavaScript can send cookies to other sites).

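To make the guest-book scenario concrete, here is a minimal sketch of the classic (non-mutation) case. The function names and the payload are illustrative assumptions, not taken from the slides. It contrasts inserting user text verbatim with HTML-encoding it first.

```javascript
// Hypothetical guest-book renderer; all names here are illustrative only.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Inserts the entry verbatim: markup supplied by the user stays active.
function renderEntryUnsafe(entry) {
  return '<li>' + entry + '</li>';
}

// Encodes the entry first: the browser shows it as text instead of parsing it.
function renderEntrySafe(entry) {
  return '<li>' + escapeHtml(entry) + '</li>';
}

const payload = '<img src=x onerror="alert(document.cookie)">';
console.log(renderEntryUnsafe(payload));
console.log(renderEntrySafe(payload));
```

Encoding everything only works for plain-text fields. Applications that must accept some HTML, such as webmail, rely on filters instead, and those filters are what the rest of the talk targets.
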
XSS (Cross-Site Scripting)
[Diagram: user input, including an XSS vector, is sent to the web application server and passes through the server-side XSS filter; the filtered HTML reaches the browser and its XSS filter. If the vector survives filtering, the XSS executes in the browser.]

mXSS (Mutation-based XSS)
[Diagram: user input passes the web server's XSS filter and the browser's XSS filter as filtered HTML; an innerHTML mutation in the browser then turns it into XSS that executes.]
- Server- and client-side XSS filters share one assumption: their HTML output and the HTML content rendered by the browser are mostly identical.
- This assumption is false!

mXSS – At the time of testing
- Impact on IE, Firefox, Chrome
- Webmail clients: Microsoft Hotmail, Yahoo! Mail…
- Bypasses HTML sanitizers:
  - HTML Purifier
  - htmLawed
  - OWASP AntiSamy
  - jSoup
  - Kses

The innerHTML Property
- A property of HTML elements
  - Creates HTML content from arbitrarily formatted strings
- Usage example
  - Read access
    - Serializes HTML DOM nodes into strings
    - Is necessary to trigger the mutation
  - Write access
    - Attaches the transformed malicious content to the DOM

Mutation
- Upon innerHTML access, the browser mutates the input string in multiple ways before sending it to the layout engine:
  - the empty class is removed
  - the tag names are set to upper case
  - the markup is sanitized
  - the HTML entities are resolved (e.g. &lt; becomes <)
- Core issue
  - The HTML markup an attacker uses to initiate an mXSS attack is considered harmless by the XSS filters.
  - Only the browser transforms the markup internally, thereby unfolding the embedded attack vector and executing the malicious code.

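The core issue can be illustrated with a toy model of one listed mutation, entity resolution. This is a pure string function written for illustration, not the browser's real serializer. The entity table and the `looksHarmless` check are assumptions standing in for a real filter.

```javascript
// Toy model only: real browser parsers and serializers are far more involved.
const NAMED_ENTITIES = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" };

// Resolves named, decimal, and hexadecimal character references.
function resolveEntities(markup) {
  return markup.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    const named = NAMED_ENTITIES[body.toLowerCase()];
    return named === undefined ? match : named;
  });
}

// Hypothetical stand-in for a filter that inspects the string before mutation.
function looksHarmless(markup) {
  return !/<\s*script/i.test(markup);
}

const filtered = '&lt;script&gt;alert(1)&lt;/script&gt;';
const mutated = resolveEntities(filtered);
console.log(looksHarmless(filtered), looksHarmless(mutated));
```

The filter approves the string it sees, but the string after resolution is a different one. That gap is what the attack vectors on the following slides exploit.
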
Backtick Characters breaking Attribute Delimiter Syntax
- Backtick (`)
  - Described in a bug report in 2007
- Upon innerHTML access
  - Affects attributes delimited by backticks or containing values that start with backticks
  - Often the regular quotes disappear, leaving the backtick characters unquoted and therefore vulnerable to injection.
- Example
  <script> imgID.innerHTML=….;</script>

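The quote-dropping effect can be sketched with a toy model. Both functions below are assumptions written for illustration: a serializer that omits quotes around values without whitespace or quote characters, and a legacy-style parser that accepts backticks as attribute delimiters. Neither is real browser code.

```javascript
// Toy serializer (assumption): quotes a value only if it contains
// whitespace, quotes, or angle brackets, or if it is empty.
function legacySerialize(tag, attrs) {
  const parts = Object.entries(attrs).map(([name, value]) =>
    /[\s"'<>]/.test(value) || value === ''
      ? `${name}="${value.replace(/"/g, '&quot;')}"`
      : `${name}=${value}`
  );
  return `<${tag.toUpperCase()} ${parts.join(' ')}>`;
}

// Toy parser (assumption): treats ", ' and ` as attribute value delimiters.
function legacyParseAttributes(markup) {
  const inner = markup.replace(/^<\w+\s*/, '').replace(/>$/, '');
  const pattern = /([^\s=]+)=(?:"([^"]*)"|'([^']*)'|`([^`]*)`|([^\s]*))/g;
  const attrs = {};
  for (const m of inner.matchAll(pattern)) {
    attrs[m[1]] = m[2] ?? m[3] ?? m[4] ?? m[5];
  }
  return attrs;
}

const original = { alt: '``onload=xss()', src: 'test.jpg' };
const serialized = legacySerialize('img', original);
const reparsed = legacyParseAttributes(serialized);
console.log(serialized);
console.log(reparsed);
```

In this model the harmless `alt` text loses its quotes on serialization, and re-parsing yields an empty `alt` plus a new `onload` attribute that was never present in the original element.
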
XML Namespaces in Unknown Elements causing Structural Mutation
- Unknown elements
  - e.g. article, aside, menu
- The xmlns attribute
  - Provides information on which XML namespace the element is supposed to reside in.
- Upon innerHTML access
  - The browser prefixes the unknown but namespaced element with the XML namespace, which itself contains unquoted input from the xmlns attribute.

Backslashes in CSS Escapes causing String-Boundary Violation
- CSS escapes
  - \unicode, \ascii
  - property: 'v\61 lue' (equivalent to property: 'value')
- Upon innerHTML access
  - The browser converts escapes to their canonical representation
  - property: 'val\27ue' → PROPERTY: 'val'ue'

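The two examples on this slide can be reproduced with a small decoder for CSS escapes. This is a simplified sketch of the escape rules (a backslash followed by one to six hex digits and an optional whitespace character, or a backslash followed by a literal character). The function name is an assumption, and this is not a browser's implementation.

```javascript
// Simplified decoder for CSS escapes; a sketch, not a full CSS tokenizer.
function decodeCssEscapes(text) {
  return text.replace(
    /\\([0-9a-fA-F]{1,6})[ \t\n\r\f]?|\\([^\n0-9a-fA-F])/g,
    (match, hex, literal) => {
      if (literal !== undefined) return literal; // e.g. \" becomes "
      const code = parseInt(hex, 16);
      const invalid = code === 0 || code > 0x10ffff || (code >= 0xd800 && code <= 0xdfff);
      return invalid ? '\uFFFD' : String.fromCodePoint(code);
    }
  );
}

const escaped = "property: 'val\\27ue'";
const canonical = decodeCssEscapes(escaped);
const quoteCount = (canonical.match(/'/g) || []).length;
console.log(canonical, quoteCount);
```

Before decoding, the value contains exactly two single quotes and forms one string. After decoding there are three, so the string ends early and the remainder falls outside the string boundary.
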
Misfit Characters in Entity Representation breaking CSS Strings
- CSS escape and entity forms of the double-quote character
  - The render engine converts them into a single quote
  - \22, &quot;, &#x22; and &#34 → ' upon innerHTML access

CSS Escapes in Property Names violating entire HTML Structure
- Terminates the style attribute
- By escaping the entire attack payload, the adversary can abuse the mutation feature and deliver arbitrary CSS-escaped HTML code.
- The attack only works with the double-quote representation inside double-quoted attributes.

Entity-Mutation in non-HTML Documents
- MIME type
  - text/xhtml, text/xml, application/xhtml+xml, application/xml
  - A web server can instruct a browser to render a document as XHTML/XML by setting a matching MIME type in the Content-Type HTTP header.
- MIME-type-dependent parser behavior → anomalies
  - In text/html: cannot happen
  - In text/xhtml and various related MIME-type rendering modes, a CSS style element is supposed to be capable of containing other markup elements.

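Because the behavior depends on the declared MIME type, a filter has to know which parsing mode a response will get. The helper below is a hypothetical sketch that classifies a Content-Type header value using the four XML types named on this slide. The function name and the returned labels are assumptions.

```javascript
// MIME types listed on the slide as triggering XHTML/XML rendering.
const XML_MIME_TYPES = new Set([
  'text/xhtml',
  'text/xml',
  'application/xhtml+xml',
  'application/xml',
]);

// Hypothetical helper: maps a Content-Type header value to a parsing mode.
function parserModeFor(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mime = String(contentType).split(';')[0].trim().toLowerCase();
  if (XML_MIME_TYPES.has(mime)) return 'xml';
  if (mime === 'text/html') return 'html';
  return 'other';
}

console.log(parserModeFor('application/xhtml+xml; charset=utf-8'));
console.log(parserModeFor('text/html'));
```
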
Entity-Mutation in non-HTML context of HTML documents
- SVG tag (fixed)

Attack Surface
- A mutation event occurs when a page accesses innerHTML
  - 74.5% of the Alexa Top 1000 websites were found to use innerHTML assignments.
- JavaScript libraries
  - Used by 65% of the top 10,000 websites
  - 48.87% use jQuery

Attack Surface
- Web mailers
  - HTML rich-text editors (RTE) → innerHTML property
  - Triggered by almost any interaction: composing, replying, spell-checking
  - mXSS vulnerabilities were analyzed and found in Microsoft Hotmail, Yahoo! Mail, Rediff Mail, OpenExchange, and Roundcube
  - Bug reports were acknowledged
- HTML sanitizers
  - Add new rules for known mutation effects
  - It is challenging to develop new filtering paradigms that may discover even unknown attack vectors.

Mitigation Techniques
- Server-side mitigation
  - Policy: disallow the special characters that browsers are known to convert improperly.
  - The refined policy for HTML and CSS was implemented in HTML Purifier.
  - Only practical for handling a subset of HTML.
  - Cannot protect against dynamically generated content.

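A character-based policy of this kind could look roughly like the sketch below. The rule list is an illustrative assumption drawn from the backtick and backslash vectors on the earlier slides. It is not the policy that was implemented in HTML Purifier, and all names are hypothetical.

```javascript
// Illustrative rule list (assumption): characters tied to the mutation
// vectors shown earlier. A real policy would be broader and context-aware.
const RISKY_CHARACTERS = [
  { name: 'backtick', pattern: /`/ },
  { name: 'backslash (CSS escape)', pattern: /\\/ },
];

// Returns the names of all rules that an attribute value violates.
function findPolicyViolations(attributeValue) {
  return RISKY_CHARACTERS
    .filter((rule) => rule.pattern.test(attributeValue))
    .map((rule) => rule.name);
}

function isAllowedByPolicy(attributeValue) {
  return findPolicyViolations(attributeValue).length === 0;
}

console.log(isAllowedByPolicy('photo of a cat'));
console.log(findPolicyViolations('``onload=xss()'));
console.log(findPolicyViolations("font-family: 'val\\27ue'"));
```

Rejecting characters outright is blunt, which matches the slide's caveat that this approach is only practical for a subset of HTML.
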
Mitigation Techniques
- Client-side mitigation
  - TrueHTML, implemented in JavaScript
  - Wrapping and sanitization process
    - Overwrites the innerHTML handlers to intercept the performance optimization and the markup mutation process
  - Free from all mutations described and documented
  - Performance impact is low; no additional developer effort is required

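The idea of overwriting the innerHTML handlers can be sketched with a property descriptor wrapper. This is a minimal illustration under stated assumptions, not TrueHTML's actual code: `wrapInnerHTML`, `FakeElement`, and the backtick-encoding sanitizer are hypothetical, and `FakeElement` stands in for a DOM element so the sketch runs outside a browser.

```javascript
// Wraps the innerHTML getter and setter found on a prototype so that every
// read and write passes through a sanitize function.
function wrapInnerHTML(proto, sanitize) {
  const original = Object.getOwnPropertyDescriptor(proto, 'innerHTML');
  if (!original || !original.get || !original.set) {
    throw new TypeError('innerHTML accessor not found on the given prototype');
  }
  Object.defineProperty(proto, 'innerHTML', {
    configurable: true,
    enumerable: original.enumerable,
    get() {
      return sanitize(original.get.call(this));
    },
    set(markup) {
      original.set.call(this, sanitize(String(markup)));
    },
  });
}

// Stand-in for a DOM element (assumption, for running without a browser).
class FakeElement {
  constructor() { this._markup = ''; }
  get innerHTML() { return this._markup; }
  set innerHTML(value) { this._markup = value; }
}

// Hypothetical sanitizer: encodes backticks so they cannot act as delimiters.
const encodeBackticks = (markup) => markup.replace(/`/g, '&#96;');

wrapInnerHTML(FakeElement.prototype, encodeBackticks);
const el = new FakeElement();
el.innerHTML = '<img alt="``onload=xss()">';
console.log(el.innerHTML);
```

In a browser the same wrapper would presumably be applied to `Element.prototype`. Page scripts keep using innerHTML unchanged, which is consistent with the slide's point that no additional developer effort is needed.
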
Evaluation Environment
- TrueHTML overhead
  - Accessed 5,000 URLs randomly chosen from the Alexa top 10,000 most popular websites
  - Typical usage scenarios: displaying an e-mail in a web mailer, accessing popular websites
  - Investigated the relation between page-load-time overhead and page size in a controlled environment
  - To demonstrate versatility, different hardware platforms were used for the different parts of the evaluation
- Evaluation environment
  - Completed by a proxy server that injects TrueHTML into the HTML context of the visited pages, and by a logging infrastructure

Evaluation Result
- User-perceived page load time depends not only on the size of the content but also on the structure and type of the markup.
- How does the TrueHTML performance overhead relate to content size and the number of markup elements?

Evaluation in a controlled environment
- Create pages containing one element with 1 kB of text content
  - <p>…(1kb)…</p>
  - document.body.innerHTML is assigned between 1 and 100 times
- Scale up to 1,000 elements

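A test page of this shape could be generated along the lines below. This is a sketch under assumptions: 1 kB is taken as 1,024 ASCII characters, the filler text is arbitrary, and `timeAssignments` accepts any assignment callback so that it also runs outside a browser. None of these names come from the slides.

```javascript
// Builds markup with the given number of <p> elements, each holding
// textBytes ASCII characters (1,024 by default, an assumed reading of "1 kB").
function buildTestMarkup(elementCount, textBytes = 1024) {
  const paragraph = `<p>${'x'.repeat(textBytes)}</p>`;
  return paragraph.repeat(elementCount);
}

// Calls assign(markup) the given number of times and returns elapsed ms.
function timeAssignments(assign, markup, repetitions) {
  const start = Date.now();
  for (let i = 0; i < repetitions; i += 1) {
    assign(markup);
  }
  return Date.now() - start;
}

// Outside a browser, a counter stands in for the real assignment.
let assignments = 0;
const elapsedMs = timeAssignments(() => { assignments += 1; }, buildTestMarkup(1000), 100);
console.log(assignments, elapsedMs);
```

In a browser the callback would be something like `(markup) => { document.body.innerHTML = markup; }`, run once with and once without the mitigation loaded to compare the two timings.
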
Conclusion
- Described a novel attack technique based on a problematic and mostly undocumented browser behavior
- Analyzed the attack surface and proposed an action plan for mitigating the dangers
- Supplied research-derived evaluations of the feasibility and practicability of the proposed mitigation techniques
- Insights
  - Defensive tools and libraries must gain awareness of the additional processing layers that browsers possess.
  - The assumption that “well-formed HTML is unambiguous” is false.

End
Thanks!
Q&A