A Symbolic Execution Framework for JavaScript

Prateek Saxena
Devdatta Akhawe
Steve Hanna
Feng Mao
Stephen McCamant
Dawn Song
UC Berkeley
Motivation: Rich Web Applications
• Client-side JS complexity in Rich Web Applications
• High cross-domain client-side data exchange
• Need tools to analyze complex applications
An Important Application:
Finding Client-side Code Injection Bugs
• An Example
• Several Client-side Data Exchange Channels for mashups
Many attack vectors, e.g.:
<IMG SRC="javascript:alert('XSS')">
<IMG SRC=JaVaScRiPt:alert('XSS')>
Data: “friendName: Joe,msg: Yo!”
facebook.com
Data: “..msg: <img src=s onerror=javascript:alert(..”
FRAGMENT ID
http://cnn.com ?friendName=Joe #msg=Yo!
postMessage
Parse the Input:
var DataStr = 'var new_msg =(' + event.data + ');';
ParseData(DataStr);
Validation checks:
var regex = /<script.*>.*?<\/script>/g;
if (regex.test(DataStr.msg)) { return false; }
Dynamic HTML update:
n.innerHTML = DataStr.msg;
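The validation check on this slide only looks for script elements, so other attack vectors pass it. The sketch below illustrates that gap under stated assumptions: `passesValidation` is a hypothetical stand-in for the slide's check, and the two payload strings are illustrative.

```javascript
// Hypothetical stand-in for the slide's validation check: reject input
// only when it contains a <script>...</script> element.
function passesValidation(msg) {
  // No "g" flag here, so test() does not carry lastIndex between calls.
  const scriptTag = /<script.*>.*?<\/script>/;
  return !scriptTag.test(msg);
}

// Illustrative payloads (assumptions, not taken from a real application).
const blockedPayload = "<script>alert(1)</script>";
const bypassPayload = "<img src=s onerror=alert(1)>";

console.log(passesValidation(blockedPayload)); // false: the check rejects it
console.log(passesValidation(bypassPayload));  // true: it would reach innerHTML
```

The second payload contains no script element at all, which is why a check of this shape cannot be assumed sufficient.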
Problem Definition
Automatically Find Code-Injection
Vulnerabilities in JS Applications
• Two challenges
• #1: Automatic exploration of the execution space
• #2: Automatically check if data is sanitized sufficiently
– Can’t distinguish parsing ops. from custom validation checks
– Can’t assume validation is sufficient (false negatives) or absent (false positives)
Our Contributions
• Existing Approaches
– Static Analysis [Gatekeeper’09, StagedInfoFlow ‘09]
– Taint-enhanced blackbox fuzzing [Flax’10]
• Drawbacks
– Either assumes an external test suite to explore paths [Flax’10]
– Or, does not generate an exploit instance, can have FPs
[Gatekeeper’09, StagedInfoFlow ‘09]
• Our Contributions
– A Symbolic Analysis approach
– Kudzu: An end-to-end symbolic execution tool for JavaScript
– Identify a sufficiently expressive “theory of strings”
– Kaluza: A new expressive, efficient decision procedure
» Supports strings, integers and booleans as first-class input variables
Outline
• Problem Definition
• Previous Approaches vs. Our Approach
• Kudzu System Design
• Kaluza Decision Procedure
• Evaluation
Kudzu: Approach and Design
• Input space has 2 components
– Event Space: GUI explorer
– Value Space: Dynamic Symbolic Execution
• Checking sufficiency of validation checks
– Symbolic analysis of validation operations on code-evaluated data
[Figure: Kudzu architecture. Labels: GUI explorer, dynamic symbolic interpreter, Kaluza decision procedure, new input feedback, checking sufficiency of validation; components are grouped as application-agnostic or application-specific.]
Dynamic Symbolic Interpreter for JavaScript
• Employed for Value Space Exploration
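[Figure: dynamic symbolic execution loop. An initial input drives concrete and symbolic execution of the program, which yields a symbolic formula f; a derived formula f' goes to the Kaluza decision procedure, which produces a new input.]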
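The loop in this slide can be sketched as a toy. This is a minimal illustration, not Kudzu's interpreter: `program`, `solve`, and `explore` are hypothetical names, the branch conditions are recorded by hand rather than by instrumenting a JavaScript engine, and a search over a small candidate pool stands in for the Kaluza decision procedure.

```javascript
// Toy program under test. Each branch reports its condition to `trace`
// so the path constraint can be collected during a concrete run.
function program(input, trace) {
  const branch = (name, pred) => {
    const taken = pred(input);
    trace.push({ name, pred, taken });
    return taken;
  };
  if (branch("startsWith #msg=", (s) => s.startsWith("#msg="))) {
    if (branch("length > 8", (s) => s.length > 8)) {
      return "sink"; // stands for reaching a code evaluation construct
    }
    return "short";
  }
  return "ignored";
}

// Stand-in for a decision procedure (assumption): search a small pool for
// an input that satisfies every (predicate, wanted outcome) pair.
function solve(constraints, pool) {
  return pool.find((s) => constraints.every((c) => c.pred(s) === c.taken));
}

function explore(initialInput, pool) {
  const seenPaths = new Set();
  const worklist = [initialInput];
  const results = [];
  while (worklist.length > 0) {
    const input = worklist.shift();
    const trace = [];
    const outcome = program(input, trace); // concrete run, constraints recorded
    const pathId = trace.map((c) => `${c.name}=${c.taken}`).join(";");
    if (seenPaths.has(pathId)) continue;
    seenPaths.add(pathId);
    results.push({ input, outcome });
    // Keep a prefix of the path, flip the next branch (f -> f'), and ask
    // the solver for an input that follows the flipped path.
    for (let i = 0; i < trace.length; i++) {
      const flipped = trace
        .slice(0, i)
        .concat([{ ...trace[i], taken: !trace[i].taken }]);
      const next = solve(flipped, pool);
      if (next !== undefined) worklist.push(next);
    }
  }
  return results;
}

const candidatePool = ["", "hello", "#msg=Yo!", "#msg=<img src=s>"];
const explored = explore("hello", candidatePool);
console.log(explored.map((r) => `${JSON.stringify(r.input)} -> ${r.outcome}`));
```

Starting from the single input "hello", the toy reaches all three paths of `program`. A real solver removes the need for a candidate pool, since it constructs the new input from the formula itself.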
Checking Sufficiency of Validation Checks
• To eliminate false positives
[Figure: the formula relating the initial input to the data reaching a code evaluation construct is intersected with an attack grammar specification; the Kaluza decision procedure decides whether the intersection is empty.]
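The emptiness check can be illustrated with a bounded toy. Everything here is an assumption made for illustration: the attack grammar is approximated by one regular expression, `vulnerableApp` and `strictApp` are hypothetical application models, and a fixed candidate list replaces the decision procedure, which would reason over all strings.

```javascript
// Assumed attack grammar: sink data that opens a script tag or carries
// an HTML event-handler attribute.
const attackGrammar = /<script\b|\bon\w+\s*=/i;

// Hypothetical application models: a validation check plus the way the
// sink value is built from the input.
const vulnerableApp = {
  validate: (input) => !/<script.*>.*?<\/script>/.test(input),
  sinkValue: (input) => "<div>" + input + "</div>",
};
const strictApp = {
  validate: (input) => !/[<>]/.test(input),
  sinkValue: (input) => "<div>" + input + "</div>",
};

// Returns an input that passes validation and whose sink value is in the
// attack grammar, or null if none of the candidates qualifies.
function findAttackWitness(app, grammar, candidates) {
  for (const input of candidates) {
    if (app.validate(input) && grammar.test(app.sinkValue(input))) {
      return input;
    }
  }
  return null;
}

const witnessCandidates = [
  "Yo!",
  "<script>alert(1)</script>",
  "<img src=s onerror=alert(1)>",
];

console.log(findAttackWitness(vulnerableApp, attackGrammar, witnessCandidates));
console.log(findAttackWitness(strictApp, attackGrammar, witnessCandidates));
```

A non-null result is a concrete exploit instance, so it cannot be a false positive for the modeled application. A null result from this toy only says that no candidate in the list worked.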
GUI Exploration
• Events: State of GUI elements, mouse and link clicks
• Event Sequence: A sequence of state-altering GUI actions
• Event Space Exploration using a GUI explorer
• Practically enhances coverage benefits
– Example: 1 gadget vulnerability is reachable only after a sequence of events is executed: dropdown box value is changed, then delete is hit
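Event space exploration can be sketched as a search over event sequences. The gadget below is a hypothetical model loosely shaped like the slide's example (a sink reachable only after a dropdown change followed by delete); the names `freshState`, `gadgetEvents`, and `exploreEvents` are assumptions, and a real GUI explorer would drive the browser rather than a state object.

```javascript
// Hypothetical gadget model: handlers mutate a small state object.
function freshState() {
  return { dropdown: "default", reachedSink: false };
}

const gadgetEvents = {
  changeDropdown: (s) => { s.dropdown = "custom"; },
  hitDelete: (s) => { if (s.dropdown === "custom") s.reachedSink = true; },
  clickLink: (s) => {}, // no state change in this toy
};

// Breadth-first enumeration of event sequences up to maxLength. Returns
// the first (shortest) sequence after which the sink has been reached.
function exploreEvents(maxLength) {
  let frontier = [[]];
  for (let len = 1; len <= maxLength; len++) {
    const next = [];
    for (const seq of frontier) {
      for (const name of Object.keys(gadgetEvents)) {
        const candidate = seq.concat(name);
        const state = freshState();
        for (const e of candidate) gadgetEvents[e](state); // replay from scratch
        if (state.reachedSink) return candidate;
        next.push(candidate);
      }
    }
    frontier = next;
  }
  return null;
}

console.log(exploreEvents(1)); // null: no single event reaches the sink
console.log(exploreEvents(2)); // [ 'changeDropdown', 'hitDelete' ]
```

No single event reaches the sink in this model, which is the reason value-space exploration alone would miss it.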
Empirical Motivation for A Theory of Strings
[Figure: distribution of string operations]
– indexOf / lastIndexOf / strlen: 78%
– concat: 8%
– replace / decodeURI / encodeURI: 8%
– substring / charAt / charCodeAt: 5%
– split / match / test: 1%
• Observations
– Combined string and integer solver
– Regular Expression based operations are 1/3rd of the match, split, test, replace operations (9%)
– Multiple string variables
Example regex: /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/
33% regexes have Capture Groups
A Sufficiently Expressive Theory for JS
• Practical Requirements to support
– Concatenation (Word Equations)
– Regular Language Membership
– String Length
– Equality
– Multiple String Variables
– Boolean and Integer Logic
[Table: support for each requirement in [DPRLE’09], [HAMPI’09], [PEX’09]; cell marks not preserved]
Existing solvers not sufficiently expressive
Kaluza: A New Solver Decision Procedure
• Input: A boolean combination of constraints over multiple integer and variable-length string variables
• Decidability vs Expressiveness
– Equality between reg language variables undecidable [STOC’81]
– Full generality of replace in word constraints undecidable [TACAS’09]
[Figure: string solving approaches: language equations and word constraints]
Insight: JS to Kaluza Reduction uses Dynamic Information
[Figure: JavaScript language operations are reduced to Kaluza core constraints]
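To make the shape of such constraints concrete, here is a bounded brute-force sketch. It is not Kaluza's algorithm: `allStrings` and `solveBounded` are hypothetical helpers that enumerate short strings over a tiny alphabet, and the sample constraint simply combines the requirement kinds listed earlier (concatenation, regular language membership, length, equality, two string variables).

```javascript
// All strings over `alphabet` with length 0..maxLen, shortest first.
function allStrings(alphabet, maxLen) {
  let level = [""];
  const out = [""];
  for (let i = 0; i < maxLen; i++) {
    const nextLevel = [];
    for (const s of level) {
      for (const c of alphabet) nextLevel.push(s + c);
    }
    out.push(...nextLevel);
    level = nextLevel;
  }
  return out;
}

// Bounded search for an assignment to two string variables (x, y).
// Returns the first satisfying assignment, or null if none exists
// within the bound.
function solveBounded(constraint, alphabet, maxLen) {
  const domain = allStrings(alphabet, maxLen);
  for (const x of domain) {
    for (const y of domain) {
      if (constraint(x, y)) return { x, y };
    }
  }
  return null;
}

// Sample constraint (illustrative only).
const sampleConstraint = (x, y) =>
  /^a+b+$/.test(x + y) && // regular language membership of a concatenation
  x.length === 2 &&       // string length tied to an integer
  y.length > x.length &&  // integer comparison between lengths
  x !== y;                // disequality between string variables

const model = solveBounded(sampleConstraint, ["a", "b"], 3);
console.log(model);

// Unsatisfiable: the lengths demanded for x and y do not add up to 2.
const unsat = solveBounded(
  (x, y) => x + y === "ab" && x.length === 2 && y.length === 1,
  ["a", "b"],
  3
);
console.log(unsat); // null
```

Enumeration like this grows exponentially with the length bound, which is one reason a dedicated decision procedure is needed in practice.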
Kudzu System Evaluation
• 18 Live Applications
– 13 iGoogle gadgets
– 5 AJAX applications
» Social networking: Academia, Plaxo
» Chat applications: AjaxIM, Facebook Chat
» Utilities: parseURI
• Setup
– Untrusted sources
» All cross-domain channels
» Text boxes
– Critical sinks
» Code evaluation constructs
Results: Summary
• Summary
– Kudzu found 11 code injection vulnerabilities automatically
– 2 previously unknown vulnerabilities
– 6 hours of testing period
• Examples
– XSS in Facebook Connect used by 2 social networking sites
– Gadget Overwriting Attacks on Google/IG
– Self-XSS on AjaxIM
• No false positives
• Finds all known vulnerabilities in our benchmarks [Flax’10]
Results: Code Coverage
29% code coverage increase in 6 hours
[Figure: code discovered and executed; series: Initial Discovered, Initial Executed, Total Discovered, Total Executed]
[Figure: Code Coverage (in %), 0 to 100; series: Initial Coverage, Coverage Increase]
Conclusion
• Kudzu: An End-to-end Symbolic Execution Tool for JS
– Separates the input space analysis into 2 components
• Identified a theory of strings expressive enough for JS
• Kaluza: A new decision procedure for the theory
• Demonstrated capabilities on 18 live web applications
• Found 11 vulnerabilities with no given initial test harness
• 2 new vulnerabilities
Contact
• Contact:
– Prateek Saxena (prateeks@cs.berkeley.edu)
• Kaluza, our core constraint solver, is online:
– http://webblaze.cs.berkeley.edu/2010/kaluza
• Please visit Webblaze, our web security research page
– http://webblaze.cs.berkeley.edu
THANKS FOR
COMING TO THE TALK
Reduction of JS Operations:
Mixed Concrete and Symbolic Power
• Example: replace full generality is undecidable
• Concretize number of occurrences of matched string
rep1 = INPUT.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@");
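[Figure: symbolic operations for the replace above. INPUT is split as S0 T1 S1 T2 S2 T3 S3, with regex membership constraints (regex R) over T1..T3 and S1..S3; OUTPUT is the concatenation of the S pieces with "@" in place of each T.]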
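The decomposition can be checked on a concrete run. This is a sketch under stated assumptions: `decompose` is a hypothetical helper, the sample `INPUT` is invented so that the regex matches exactly three times, and the code only verifies the concatenation identity for that run rather than building solver constraints.

```javascript
// Regex from the slide (matches JSON-style escape sequences).
const R = /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g;

// Split input as S0 T1 S1 ... Tn Sn, where each Ti is a match of `regex`
// and each Si is the text between consecutive matches.
function decompose(input, regex) {
  const S = [];
  const T = [];
  let last = 0;
  for (const m of input.matchAll(regex)) {
    S.push(input.slice(last, m.index));
    T.push(m[0]);
    last = m.index + m[0].length;
  }
  S.push(input.slice(last));
  return { S, T };
}

// Illustrative input with three escape sequences: \n, \u0041 and \".
const INPUT = 'a\\nb\\u0041c\\"d';
const { S, T } = decompose(INPUT, R);

// With the match count fixed at T.length, the replace result is a plain
// concatenation of the S pieces separated by "@".
const OUTPUT = INPUT.replace(R, "@");
console.log(T.length);                // 3
console.log(S.join("@") === OUTPUT);  // true
```

Fixing the number of matches from the concrete run is what turns the general replace into ordinary concatenation and regex membership constraints.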
Results: Solver Performance
SAT cases: < 1sec, UNSAT 1-50 secs
Comparison of Symbolic Execution Alone with GUI Exploration
• Symbolic Execution Alone vs. Full-featured Kudzu
[Figure: chart comparing Symbolic Execution Alone with Full-featured Kudzu]
Example Attacks: Gadget Overwriting
[Figure: screenshot of an attack link to an iGoogle page; the URL bar remains legitimate while the compromised gadget shows overwritten contents.]