QUERY RE-EVALUATION FOR HANDLING SQL INJECTION ATTACKS
Xiaoying Shen
B.S., Donghua University, China, 2000
PROJECT
Submitted in partial satisfaction of
the requirements for the degree of
MASTER OF SCIENCE
in
COMPUTER SCIENCE
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
FALL
2011
QUERY RE-EVALUATION FOR HANDLING SQL INJECTION ATTACKS
A Project
by
Xiaoying Shen
Approved by:
__________________________________, Committee Chair
Ying Jin, Ph.D.
__________________________________, Second Reader
Jinsong Ouyang, Ph.D.
____________________________
Date
Student: Xiaoying Shen
I certify that this student has met the requirements for format contained in the University
format manual, and that this project is suitable for shelving in the Library and credit is to
be awarded for the Project.
__________________________, Graduate Coordinator
Nikrouz Faroughi, Ph.D.
Department of Computer Science
________________
Date
Abstract
of
QUERY RE-EVALUATION FOR HANDLING SQL INJECTION ATTACKS
by
Xiaoying Shen
Most modern web applications rely on retrieving up-to-date data from a database. In
response to a request from a web page, the application generates a SQL query, often
incorporating portions of the user input into the query. SQL injection refers to
injecting crafted malicious SQL query segments to change the intended effect of a SQL
query. An attacker could access unauthorized data, or even gain complete control over the
web server or back-end database system. SQL injection has become one of the top
web application vulnerabilities.
In this project, I surveyed different types of SQL injection attacks and the corresponding
countermeasure strategies proposed by other researchers. A new technique to detect and
prevent SQL injection attacks is presented; the basic idea is to insert a validation process
between the generation of a SQL query and its execution. The technique consists of
both static analysis of web application code and a runtime validation check of each
dynamically generated SQL query. It involves four steps: identifying hotspots, analyzing
the SQL query, initialization, and the runtime validation check. The project was
implemented in Java. A performance evaluation was also conducted.
_______________________, Committee Chair
Ying Jin, Ph.D.
_______________________
Date
ACKNOWLEDGMENTS
I would like to thank my advisor, Dr. Ying Jin, for the guidance that she has provided throughout
this project.
I also want to thank my second reader, Dr. Jinsong Ouyang, for taking the time to read the report
and giving me advice.
I would like to thank my family for their continuous support during my graduate study.
TABLE OF CONTENTS
Acknowledgments
List of Tables
List of Figures
Chapter
1. INTRODUCTION
2. SQL INJECTION DEFINITION AND BACKGROUND
2.1 Background of SQL Injection
2.2 Different Types of SQL Injection
2.2.1 SQL Manipulation
2.2.2 Code Injection
2.2.3 Fingerprinting and Enumeration
2.2.4 Denial of Service Attack
3. DETECTION AND PREVENTION OF SQL INJECTION ATTACKS
3.1 Defensive Coding
3.1.1 Prepared Statement
3.2 Detection and Prevention Techniques
3.2.1 Static Analysis
3.2.2 Combine Static and Runtime Analysis
3.2.3 Runtime Analysis
3.2.4 Using Intrusion Detection System
4. IMPLEMENTATION
4.1 System Architecture
4.1.1 Identify Hotspots
4.1.2 SQL Query Analyzer
4.1.3 Validation Check Initialization
4.1.4 Runtime Validation Check
4.2 Implementation Discussion
4.2.1 Hash Function
4.2.2 Rolling Key
4.2.3 Building Index for encryptionT Table
4.2.4 Trigger
4.3 Performance Evaluation
5. CONCLUSION
Appendix Source Code
Bibliography
LIST OF TABLES
1. user_tb table
2. encryptionT table
LIST OF FIGURES
1. Three-tiered Web Application Architecture
2. Percent of Web application Vulnerabilities
3. Prepared Statement Example
4. User Login Code in JAVA; Hotspot is Bolded
5. Wrap the Hotspot with Additional Validation Check
6. XML Parse Tree Generated by General SQL Parser
7. Java Code Implementing HMAC Function
8. Java Code Implementing a Task to be Scheduled
9. Java Code Implementing Scheduling a Task
10. Create encryptionT table with index
11. Query encryptionT Table with Index
12. Create a insertTrigger on user_tb
13. Create a deleteTrigger on user_tb
14. Create a updateTrigger on user_tb
15. Runtime Response Time
Chapter 1
INTRODUCTION
As use of the Internet has grown rapidly in recent years, database-driven web applications
have come to be widely used in all kinds of organizations, including large and small
companies, government agencies, universities, and other institutions. Network security is
therefore of utmost importance. Confidentiality, integrity, and availability are the three
fundamental objectives of security. Confidentiality refers to limiting information access
and disclosure to authorized users with sufficient privileges and preventing access by or
disclosure to unauthorized ones. When information is accessed by someone who is not
authorized to do so, the result is known as loss of confidentiality. Integrity means that
data cannot be modified undetectably. When information is modified in an unexpected way,
the result is known as loss of integrity. Availability ensures that a resource is available
when it is needed.
SQL injection refers to injecting crafted malicious SQL query segments to change
the intended effect of a SQL query. It can result in loss of confidentiality, integrity, and
availability. A SQL injection attack allows attackers to access unauthorized data, or
even gain complete control over the web server or back-end database system. Because
the database typically contains the critical data for the application, it is a very attractive
target for attackers. In addition, web application code often contains vulnerabilities. SQL
injection has become one of the top ten web application vulnerabilities.
This project presents our approach to handling SQL injection attacks. In the first phase of
this project, I studied different forms of SQL injection attacks, as well as the
prevention and detection techniques proposed by other researchers. In the second phase,
a new technique to detect and prevent SQL injection attacks is presented. The basic idea
is to insert a validation process between the generation of a SQL query and its
execution. After the system architecture is described, the implementation details of each
individual step are discussed. The system performance is evaluated at the end.
This report is organized as follows. Chapter 2 introduces the background and the different
types of SQL injection attacks. Chapter 3 surveys related work and the countermeasure
techniques proposed by other researchers. Chapter 4 presents our approach to detecting
and preventing SQL injection attacks. Chapter 5 concludes the project and proposes
future work.
Chapter 2
SQL INJECTION DEFINITION AND BACKGROUND
SQL injection is a code injection technique. By injecting illegal content into a SQL
query, an attacker can gain unauthorized access to the backend database of a web
application and retrieve, modify, or delete the data stored in the database.
2.1 Background of SQL Injection
Figure 1 shows a typical three-tiered web application architecture.
Figure 1 Three-tiered Web Application Architecture
The client tier runs only the user interface on the end user's computer. The middle tier is
the application server, which runs the business logic, processes the data, and queries the
backend database. The backend database server stores information for the web
application. Web pages present users with dynamically generated content. Based on
user input, the web application retrieves data and shows it in the user interface. For
example, in an online library application, two users searching for books may get
different search results. The application server queries the database to access the data. The
interaction is normally implemented in a general-purpose programming language, such
as Java, through an application programming interface (API), such as JDBC. Put
simply, the web application gets input from users and incorporates it into SQL queries
to query the underlying database.
According to the Web Application Security Statistics Project sponsored by the Web
Application Security Consortium (WASC), about 49% of web applications contain
vulnerabilities of a high risk level. SQL injection is one of the top 10 web application
security risks. Figure 2 [1] shows that about 7% of all web application vulnerabilities
are caused by SQL injection.
Figure 2 Percent of Web application Vulnerabilities
SQL injection refers to a class of code-injection attacks in which data provided by the
user is included in a SQL query in such a way that part of the user input is treated as SQL
code. There are two main reasons SQL injection occurs. First, SQL queries are
constructed by incorporating user input; if the user input is not handled properly, it
can cause a serious system vulnerability. Second, the underlying database often contains
sensitive and confidential information, which is very attractive to attackers.
Using SQL injection, an attacker may extract, insert, or modify data in the database;
bypass authentication; perform privilege escalation; perform database fingerprinting to
discover the type and version of the database that the web application is using; or
perform a denial of service attack. SQL injection can have very severe consequences.
In April 2011, attackers from Japan and China used SQL injection to gain access to
customers' credit card data from Neo Beat, an Osaka-based company. The theft affected
12,191 customers.
The attack can be targeted at all types of database servers, including Oracle, Microsoft
SQL Server, and MySQL. The main cause of SQL injection vulnerabilities is insufficient
validation of user input.
2.2 Different Types of SQL Injection
In this section, I present four main types of SQL Injection attacks.
2.2.1 SQL Manipulation
This is the most common type of SQL injection attack. The attacker manipulates the
SQL statement by changing the WHERE clause [2].
The following SQL statement can be used to check user authentication in a web
application; if any row is returned, the user is considered authenticated.
SELECT * FROM users WHERE username = 'aUser' AND password = 'aPassword'
The attacker may manipulate the SQL statement by inserting a tautology that disables
password verification. If the attacker enters the value "' or 1=1 -- " as the username and
leaves the password empty, the original SQL statement becomes:
SELECT * FROM users WHERE username = '' or 1=1 -- ' AND password = ''
The "--" sequence turns the rest of the statement into a comment, and 1=1 is always true,
so the WHERE clause is true for every row. When the above SQL query is executed, all the
information in the users table is returned, so the attacker can gain unauthorized access to
the application without a valid username and password.
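The effect of the tautology can be reproduced without a database by looking only at the string the application would send. The following is a minimal sketch, assuming the application builds its login query by plain string concatenation; the class and method names (TautologyDemo, buildLoginQuery) are hypothetical and are not taken from the project's code.

```java
final class TautologyDemo {
    // Hypothetical helper: builds the login query by plain string
    // concatenation, which is the vulnerable pattern discussed above.
    static String buildLoginQuery(String username, String password) {
        return "SELECT * FROM users WHERE username = '" + username
                + "' AND password = '" + password + "'";
    }

    public static void main(String[] args) {
        // Legitimate input: the query checks both username and password.
        System.out.println(buildLoginQuery("aUser", "aPassword"));
        // Malicious input: the quote closes the string literal, "or 1=1"
        // adds a tautology, and "--" comments out the password check.
        System.out.println(buildLoginQuery("' or 1=1 -- ", ""));
    }
}
```

The second call produces a query whose text differs structurally from the first, even though the application code is unchanged; that structural change is what the detection techniques surveyed in Chapter 3 try to catch.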
2.2.2 Code Injection
A code injection attack attempts to add additional SQL statements to the existing SQL
statement [2]. This type of attack works only if the database supports execution of
multiple SQL statements per database request. Oracle does not support this, so a code
injection attack against an Oracle database results in an error. The attack is frequently
used against Microsoft SQL Server applications.
One way to perform this type of attack is to insert a UNION query into the SQL
statement. For example, if the attacker enters "' UNION SELECT * FROM sometable --"
for the username, the query becomes the following:
SELECT * FROM users WHERE username = '' UNION SELECT * FROM sometable --'
AND password = 'anypassword';
The first query returns an empty set, and the second query returns the data from sometable.
The database unions the result sets of these two queries and returns them to the application.
Without a legitimate username and password, the above SQL statement returns all the
records in sometable. More severe damage can be caused by attaching DML and DDL
statements.
Another form of code injection attack adds a second query to the original query.
For example, if the attacker enters "'; drop table sometable --" as the password, the
query becomes the following:
SELECT * FROM users WHERE user = 'someuser' AND password = ''; drop table
sometable -- '
After completing the first query, the database executes the second query, which drops
sometable. In general, any type of SQL statement can be inserted, so this type of
attack is extremely harmful.
2.2.3 Fingerprinting and Enumeration
An attack usually starts with fingerprinting: gathering information about the target system.
Useful information includes the type and version of the database server and the table
schemas. Most web applications display database errors to users, and each type of
database server returns its own distinctive error messages. The attacker can use crafted
inputs to trigger error messages and gather database information from them.
2.2.4 Denial of Service Attack
SQL injection can be used to perform denial of service (DoS) attacks. A SQL injection
DoS attack overloads a target system by submitting queries that consume a large
amount of system resources, so that the system is unable to provide normal service to its
users.
Chapter 3
DETECTION AND PREVENTION OF SQL INJECTION ATTACKS
Researchers have proposed different techniques to detect and prevent SQL injection
attacks. In this chapter, I summarize the advantages and disadvantages associated with
each technique.
3.1 Defensive Coding
The most effective way to prevent SQL injection attacks is to apply defensive coding
practices.
3.1.1 Prepared Statement
Prepared statements (parameterized queries) force the code to first define the type of
every user input, and then pass each parameter in to the query later. This ensures that an
attacker is not able to change the intent of a query. In the Java code example in Figure 3,
the string query contains a SQL query with question marks, which serve as placeholders for
the user input. User input for the bind variables is passed to the placeholders by the
setString method. If an attacker enters "someuser' or '1=1'", the parameterized query looks
for a username that literally matches the entire string, and the login attempt fails.
String query = "SELECT * FROM users WHERE username=? AND password=?";
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, username);
pstmt.setString(2, password);
ResultSet resultset = pstmt.executeQuery();
Figure 3 Prepared Statement Example
The example shown is in Java. Practically all common programming languages support
prepared statements.
Although enforcing defensive coding practices is the best way to prevent SQL injection
attacks, it still cannot guarantee flawless application code because human errors are
inevitable. Sometimes developers forget to perform input validation or perform it
inadequately. In addition, a large amount of legacy application code exists, and patching
all of those systems can be very tedious.
3.2 Detection and Prevention Techniques
Researchers have proposed different types of detection and prevention techniques for the
countermeasure of SQL injection attacks.
3.2.1 Static Analysis
Huang and colleagues [3] proposed WAVES (Web Application Vulnerability and Error
Scanner), a black-box technique for testing web applications for SQL injection
vulnerabilities. It attacks the target system, monitors the application's response, and uses
machine learning techniques to improve its attacks. The drawback of this approach is that
it cannot guarantee completeness.
Gould and colleagues [4, 5] proposed a technique called JDBC Checker. It performs
static analysis to verify the type correctness of dynamically generated SQL queries,
using finite state automata to enforce type correctness. It is able to find type mismatch
errors in the web application. However, it does not find more general forms of SQL
injection attacks that generate syntactically correct and type-correct queries.
3.2.2 Combine Static and Runtime Analysis
Halfond and Orso [6, 7] proposed a tool called AMNESIA to detect SQL injection attacks
by combining static analysis and runtime monitoring techniques. At static time, they use
a string analysis technique to identify the hotspots and build a SQL-query model for each
hotspot. At runtime, they check the dynamically generated queries against the SQL-query
model, and reject and report queries that violate the model. The primary limitation
of this technique is that its success depends on the accuracy of the static analysis used
to build the query models. In certain situations the tool can generate false positives and
false negatives.
Huang et al. [8] developed a tool called WebSSARI (Web application Security by Static
Analysis and Runtime Inspection), which detects input validation related errors using
information flow analysis. It uses a lattice-based static analysis algorithm. At static time,
it checks taint flows against preconditions for sensitive functions, and inserts a runtime
guard at the points where preconditions have not been met. Filters and sanitization
functions are added to the web application code to satisfy the preconditions. The primary
drawback of WebSSARI is that it cannot find all the vulnerabilities in a system.
3.2.3 Runtime Analysis
SQLGuard [9] is a runtime analysis technique that uses parse tree validation to detect
SQL injection attacks. This approach detects the attack by comparing the tree structure of
the SQL query before and after the concatenation of user input. It uses a secret key to
wrap user input so this technique requires code changes. The key should be kept secret,
otherwise the attacker may bypass the check.
Boyd and Keromytis [10] proposed SQLrand. It is a technique based on instruction-set
randomization. Developers can create queries using randomized instructions instead of
normal SQL keywords. A proxy filter then de-randomizes the randomized query and
converts it to proper query for the database. The code injected by attacker would not have
been constructed using the randomized instruction set and would cause runtime
exceptions. The drawback of this technique is the complexity of configuration, and the
security of the approach is dependent on the security of the key.
3.2.4 Using Intrusion Detection System
Valeur and colleagues [11] proposed the use of an intrusion detection system (IDS) to detect
SQL injection attacks. It is an anomaly-based IDS built on machine learning techniques.
In the training phase, a number of normal application queries are fed into the
machine-learning algorithm, which generates models that characterize the profiles of
normal usage. In the detection phase, the technique monitors the application to identify
queries that do not match the models. The system is able to detect attacks with a high rate
of success. However, a poor training set can cause the learning technique to generate a
large number of false positives and false negatives.
Chapter 4
IMPLEMENTATION
This chapter presents our approach to detect and prevent SQL injection attacks. The basic
idea is to insert a validation process between the generation of SQL query and the query
execution. This chapter covers the implementation details.
4.1 System Architecture
The system consists of two parts: static analysis of the web application code and a runtime
validity check of each dynamically generated SQL query. The following four steps are
involved, where Steps 1, 2, and 3 are static-time analysis and Step 4 is runtime analysis.
1. Identify hotspots: parses the web application code to detect and locate the hotspots.
2. SQL query analyzer: analyzes the SQL query string for each hotspot.
3. Validation check initialization: prepares for the validity check. Using the table name
and attribute names retrieved in the second step, it applies a hash function to each
attribute value, builds the encryptionT table, and creates triggers on the tables
being queried.
4. Runtime validation check: checks the validity of user input by applying the hash function
to each user input and comparing the hash value with the values in the encryptionT table.
The following subsections present the details of each step.
4.1.1 Identify Hotspots
A hotspot is a place where the code interacts with the underlying database and a SQL
query is executed. In web applications implemented in Java, a typical hotspot is a
place where any of the following methods is called: execute, executeQuery, or
executeUpdate. In the code example shown in Figure 4, the hotspot is at line 4. Only
SQL query strings that contain user input are analyzed further. When a hotspot is
identified, we add two methods to wrap it, the SQL query analyzer and the SQL query
validation check, as shown in Figure 5. We use the approach of [12] to implement hotspot
identification. A regular expression is created to match all the different source code
forms of a hotspot. Using this regular expression, a matcher is created to parse the web
application code and identify the hotspots.
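As an illustration of regular-expression-based hotspot identification, the sketch below scans source text line by line for calls to execute, executeQuery, or executeUpdate. The pattern and the names (HotspotFinder, findHotspotLines) are assumptions made for this example; the regular expression actually used in the project, following [12], is not reproduced in this section and may cover more source code forms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HotspotFinder {
    // Assumed pattern: a method call named execute, executeQuery, or
    // executeUpdate, followed by an opening parenthesis.
    private static final Pattern HOTSPOT =
            Pattern.compile("\\.\\s*(execute(?:Query|Update)?)\\s*\\(");

    // Returns the 1-based numbers of the source lines that contain a hotspot.
    static List<Integer> findHotspotLines(String source) {
        List<Integer> lines = new ArrayList<>();
        String[] sourceLines = source.split("\n");
        for (int i = 0; i < sourceLines.length; i++) {
            Matcher matcher = HOTSPOT.matcher(sourceLines[i]);
            if (matcher.find()) {
                lines.add(i + 1);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String source =
                "Statement statement = connection.createStatement();\n"
              + "String queryString = \"SELECT * FROM user_tb\";\n"
              + "ResultSet resultSet = statement.executeQuery(queryString);\n";
        System.out.println(findHotspotLines(source));
    }
}
```

A line-based scan like this would miss a call that is split across several lines, which is one reason a production matcher would normally work on the whole source text or on a parsed syntax tree.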
public class SomeApp extends HttpServlet {
    public ResultSet getUser(String username, String password)
    {
1.      java.sql.Connection connection = DriverManager.getConnection( );
2.      java.sql.Statement statement = connection.createStatement();
3.      String queryString = "SELECT * FROM user_tb WHERE username=" + username
            + " AND password=" + password;
4.      ResultSet resultSet = statement.executeQuery(queryString);
5.      return resultSet;
    }
}
Figure 4 User Login Code in JAVA; Hotspot is Bolded
4a.     QueryAnalyze(queryString);
4b.     if (QueryCheck.hashCompare())
        {
4c.         ResultSet resultSet = statement.executeQuery(queryString);
5.          return resultSet;
        }
Figure 5 Wrap the Hotspot with Additional Validation Check
4.1.2 SQL Query Analyzer
The SQL query analyzer retrieves the table and attribute names from a SQL query string.
We built our SQL query analyzer using General SQL Parser [13]. General SQL Parser
consists of two main components: 1) a Lex parser; 2) a Yacc parser. The SQL query string
is first tokenized into a list of tokens by the Lex parser. Then, based on the BNF of
the different database dialects, the Yacc parser converts the source tokens to a parse tree.
The top node of the parse tree is the type of SQL statement, such as SelectStatement,
InsertStatement, DeleteStatement, or UpdateStatement. The subtree nodes include
columnlist, tablelist, where clause, etc.
The SQL query analyzer consists of two sub-steps. First, it takes a SQL query string as
input, parses it, and writes the result to an XML file. For example, the following SQL
query is converted to the XML file shown in Figure 6:
SELECT * FROM user_tb WHERE username='panda';
<sqlscript>
<TStatementList size='1'>
<TSelectSqlStatement setOperator='0'>
<TResultColumnList size='1'>
<TResultColumn>
<TExpression type='15'>
<TObjectName type='1'>
*</TObjectName>
</TExpression>
</TResultColumn>
</TResultColumnList>
<TJoinList size='1'>
<TJoin type='1'>
<TTable type='objectname'>
user_tb</TTable>
<TJoinItemList size='0'>
</TJoinItemList>
</TJoin>
</TJoinList>
<TWhereClause>
<TExpression type='40'>
<comparisonOperator>
=</comparisonOperator>
<TExpression type='15'>
<TObjectName type='1'>
username</TObjectName>
</TExpression>
<TExpression type='16'>
<TConstant>
'panda'</TConstant>
</TExpression>
</TExpression>
</TWhereClause>
</TSelectSqlStatement>
</TStatementList>
</sqlscript>
Figure 6 XML Parse Tree Generated by General SQL Parser
The advantage of converting the SQL query string to an XML file is that this kind of
conversion works for any type of query, including SELECT, DELETE, INSERT, and UPDATE.
Moreover, it is able to retrieve joined tables in the FROM clause. In the second sub-step,
the analyzer parses the converted XML file and returns the name of the table being queried
and the attribute names in the WHERE clause. For the example shown above, the table name
is 'user_tb' and the only attribute name is 'username'.
For each hotspot, the table name and attribute names are recorded. They are used in
the initialization and runtime validity check steps.
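The second sub-step can be sketched with the standard Java DOM API. The example below assumes the element names that appear in Figure 6 (TTable, TWhereClause, TObjectName) and uses a shortened copy of that XML; the class and method names are hypothetical, and the project's own XML-reading code is not shown in this section.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class ParseTreeReader {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects the trimmed text of every descendant element with the given tag.
    static List<String> textOf(Element root, String tag) {
        List<String> values = new ArrayList<>();
        NodeList nodes = root.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent().trim());
        }
        return values;
    }

    // Table names are taken from the TTable elements.
    static List<String> tableNames(Document doc) {
        return textOf(doc.getDocumentElement(), "TTable");
    }

    // Attribute names are the TObjectName elements inside TWhereClause only,
    // so names in the select list (such as *) are not included.
    static List<String> whereAttributes(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList clauses = doc.getElementsByTagName("TWhereClause");
        for (int i = 0; i < clauses.getLength(); i++) {
            names.addAll(textOf((Element) clauses.item(i), "TObjectName"));
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<sqlscript><TSelectSqlStatement>"
                + "<TResultColumnList><TObjectName type='1'>\n*</TObjectName></TResultColumnList>"
                + "<TJoinList><TTable type='objectname'>\nuser_tb</TTable></TJoinList>"
                + "<TWhereClause><TExpression type='40'>"
                + "<TObjectName type='1'>\nusername</TObjectName>"
                + "<TConstant>\n'panda'</TConstant>"
                + "</TExpression></TWhereClause>"
                + "</TSelectSqlStatement></sqlscript>";
        Document doc = parse(xml);
        System.out.println(tableNames(doc));      // [user_tb]
        System.out.println(whereAttributes(doc)); // [username]
    }
}
```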
4.1.3 Validation Check Initialization
The initialization process includes 1) creating and initializing the encryptionT table; and
2) setting up triggers on the original table. For example, user_tb, shown in Table 1, has
two columns, username and password. The encryptionT table, shown in Table 2, consists of
four columns: table_name, attribute_name, attribute_value, and hashed_value. table_name
records the name of the table. attribute_name records the name of the attribute.
attribute_value stores the value of the corresponding attribute. hashed_value stores the
result of hashing each attribute value in the original user_tb table. The particular hash
function we use is discussed in Section 4.2.1.
Since user_tb can be modified at any time, triggers have to be created so that the
encryptionT table reflects the most recent updates to user_tb. The detailed implementation
of the triggers is discussed in Section 4.2.4.
Table 1 user_tb table
username    password
panda       123
cat         456
Table 2 encryptionT table
table_name  attribute_name  attribute_value  hashed_value
user_tb     username        panda            bf52d6e7c590c26743c917658fe7c6ee014725cf
user_tb     password        123              b14e92eb17f6b78ec5a205ee0e1ab220fb7f86d7
user_tb     username        cat              e778c7ffbb72f2c05d426d84b3aeaae0b3952105
user_tb     password        456              ab567f1ae9fcb23472379151a24705cbc106ea0e
4.1.4 Runtime Validation Check
Since we wrapped the SQL query execution (hotspot) with the query validation check
method, every time a hotspot is reached, the program continues executing only if our
runtime validity check returns true; otherwise a warning message is given and the
execution halts. The runtime input validity checker takes the user input and applies the
same hash function used for initialization. Let v1 be the hashed value obtained at runtime;
v1 is then used to query the encryptionT table:
SELECT attribute_value AS v_set FROM encryptionT WHERE hashed_value = v1
If the user input matches any value in v_set, it is considered a valid input and program
execution continues. If the input is malicious, however, no value is returned by the
above query and the program does not pass the validation.
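The lookup logic can be sketched in memory, with a map standing in for the encryptionT table. This is only an illustration under stated assumptions: the class name, the fixed demo key, and the use of a HashMap instead of a database table are all hypothetical, while HMAC-SHA1 is the hash function described in Section 4.2.1.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class RuntimeCheckSketch {
    // In-memory stand-in for encryptionT: hashed_value -> attribute values.
    private final Map<String, Set<String>> encryptionT = new HashMap<>();
    private final byte[] key;

    RuntimeCheckSketch(String key) {
        this.key = key.getBytes(StandardCharsets.UTF_8);
    }

    // HMAC-SHA1 of the input, encoded as lowercase hexadecimal.
    String hmac(String input) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            StringBuilder sb = new StringBuilder();
            for (byte b : mac.doFinal(input.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Initialization step: record a stored attribute value under its hash.
    void register(String attributeValue) {
        encryptionT.computeIfAbsent(hmac(attributeValue), h -> new HashSet<>())
                   .add(attributeValue);
    }

    // Runtime step: the input is valid only if its hash is present and the
    // input equals one of the values stored under that hash (v_set).
    boolean isValid(String userInput) {
        Set<String> vSet = encryptionT.get(hmac(userInput));
        return vSet != null && vSet.contains(userInput);
    }

    public static void main(String[] args) {
        RuntimeCheckSketch check = new RuntimeCheckSketch("demo-key-for-illustration-only");
        check.register("panda");
        check.register("123");
        System.out.println(check.isValid("panda"));       // true
        System.out.println(check.isValid("' or 1=1 --")); // false
    }
}
```

Comparing the input against the stored values after the hash lookup mirrors the v_set comparison described above, so a hash collision alone is not enough for an input to be accepted.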
4.2 Implementation Discussion
In this section, the implementation details are discussed.
4.2.1 Hash Function
A hash function is a mathematical function that maps a string of arbitrary length (up to a
pre-determined maximum size) to a fixed-length string. A keyed-hash message
authentication code (HMAC) is a message authentication code that uses a cryptographic
key in conjunction with a hash function. HMAC can be used with any iterative
cryptographic hash function. The cryptographic strength of HMAC depends on the
properties of the underlying hash function. Below is the definition of HMAC [14]:
Let:
H(·) be a cryptographic hash function
K be a secret key padded to the right with extra zeros to the input block size of the hash
function, or the hash of the original key if it's longer than that block size
m be the message to be authenticated
∥ denote concatenation
⊕ denote exclusive or (XOR)
opad be the outer padding (0x5c5c5c…5c5c, one-block-long hexadecimal constant)
ipad be the inner padding (0x363636…3636, one-block-long hexadecimal constant)
Then HMAC(K,m) is mathematically defined by
HMAC(K,m) = H((K ⊕ opad) ∥ H((K ⊕ ipad) ∥ m))
The key for HMAC can be of any length up to B = 64 bytes, the block length of
the hash function. Keys longer than B bytes are first hashed using H, and the result is
used as the actual key to HMAC. The minimal recommended length for K is the
byte length of the hash output (20 bytes for SHA-1 [14]). We use a 40-byte key for the
hash function.
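To make the definition concrete, the sketch below computes HMAC-SHA1 directly from the formula, using SHA-1 as H with a 64-byte block, and checks the result against the HmacSHA1 implementation that ships with the Java platform. The class and method names are hypothetical and the key and message are arbitrary demo values; this is an illustration of the definition, not the project's code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HmacByDefinition {
    private static final int BLOCK_SIZE = 64; // SHA-1 block size in bytes

    // HMAC(K, m) = H((K xor opad) || H((K xor ipad) || m)), with H = SHA-1.
    static byte[] hmacSha1(byte[] key, byte[] message) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            // Keys longer than one block are hashed first.
            byte[] k = key.length > BLOCK_SIZE ? sha1.digest(key) : key;
            // Pad the key on the right with zeros to the block size.
            k = Arrays.copyOf(k, BLOCK_SIZE);

            byte[] inner = new byte[BLOCK_SIZE];
            byte[] outer = new byte[BLOCK_SIZE];
            for (int i = 0; i < BLOCK_SIZE; i++) {
                inner[i] = (byte) (k[i] ^ 0x36); // K xor ipad
                outer[i] = (byte) (k[i] ^ 0x5c); // K xor opad
            }
            sha1.update(inner);
            byte[] innerHash = sha1.digest(message); // H((K xor ipad) || m)
            sha1.update(outer);
            return sha1.digest(innerHash);           // H((K xor opad) || innerHash)
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reference result from the platform's HmacSHA1 implementation.
    static byte[] hmacSha1Jce(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789012345678901234567890123456789"
                .getBytes(StandardCharsets.US_ASCII); // 40-byte demo key
        byte[] msg = "panda".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.equals(hmacSha1(key, msg), hmacSha1Jce(key, msg)));
    }
}
```

The long-key branch corresponds to the rule that keys longer than the block size are hashed before use.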
The code in Figure 7 calculates the hash value of an input string. The Java
Cryptography Extension (JCE) is used to implement the HMAC function. First, a 40-byte
random alphanumeric string is generated by the Apache Commons RandomStringUtils
class and then converted to a byte array. The SecretKeySpec class is used to construct a
secret key for the specified HMAC algorithm from the given byte array. The Mac class
provides the functionality of a message authentication code algorithm. Finally, the
resulting byte array is converted to a hexadecimal string for output.
public static void GetKey()
{
    //Generate a 40-character random alphanumeric key
    key = RandomStringUtils.randomAlphanumeric(40);
}
//HMAC
public static String CalHmac(String input, String aKey)
{
    byte [] digest = null;
    try
    {
        //Construct a secret key with a byte array
        SecretKeySpec secret = new
            SecretKeySpec(aKey.getBytes(),"HmacSHA1");
        Mac mac;
        //Get instance of Mac object implementing HMAC-SHA1 and
        //initialize it with the secret key
        mac = Mac.getInstance("HmacSHA1");
        mac.init(secret);
        digest = mac.doFinal(input.getBytes());
    }
    catch (NoSuchAlgorithmException e)
    {
        //Fail here rather than with a NullPointerException on digest below
        throw new IllegalStateException(e);
    }
    catch (InvalidKeyException e)
    {
        throw new IllegalStateException(e);
    }
    //Convert the digest to a hexadecimal string, two digits per byte
    StringBuffer sb=new StringBuffer();
    for (int i = 0; i < digest.length; i++)
    {
        String hex=Integer.toHexString(0xff & digest[i]);
        if(hex.length()==1) sb.append('0');
        sb.append(hex);
    }
    return sb.toString();
}
Figure 7 Java Code Implementing HMAC Function
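One point that may deserve attention is the source of randomness for the key. Older releases of Apache Commons Lang are understood to back RandomStringUtils with java.util.Random, which is not designed for generating secrets. The sketch below shows one possible alternative that draws the same kind of 40-character alphanumeric key from java.security.SecureRandom. It is not part of the project's code, and the class and method names are assumptions introduced for illustration.

```java
import java.security.SecureRandom;

// Hypothetical replacement for GetKey() that uses a cryptographically strong generator.
final class KeyGeneratorSketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns a random alphanumeric string of the requested length.
    static String randomAlphanumericKey(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 40 characters, matching the key length used in this project
        System.out.println(randomAlphanumericKey(40));
    }
}
```

The returned string can be passed to CalHmac as the aKey argument without any other change.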
4.2.2 Rolling Key
RFC 2104 [14] suggests that the key used for the HMAC function be refreshed
periodically to limit the damage of an exposed key. There is no specific recommended
frequency for key changes. In my implementation, it is set to 24 hours, so a new random
key is generated every 24 hours. Consequently, the encryptionT table needs to be
updated, as well as the triggers. As shown in Figure 8, the QueryCheck class extends the
TimerTask class, and the work to be scheduled is implemented in TimerTask's abstract
run method. The java.util.Timer class is used to schedule the task. The task can be
scheduled to start at a particular date and repeat at any time interval, for example, once a
day at 3 am, as shown in Figure 9.
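import java.sql.Connection;
import java.util.TimerTask;
public class QueryCheck extends TimerTask
{
    public void run( )
    {
        Connection con = Base.getConnection( );
        String tableName = "user_tb";
        String[] columns = {"username", "password"};
        //generate a new key
        GetKey();
        //reset EncryptionT table
        InitializeEncryptionT(con, tableName, columns);
        //update three triggers
        setTrigger(con, tableName, columns);
        Base.closeConnection(con);
    }
}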
Figure 8 Java Code Implementing a Task to be Scheduled
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Timer;
import java.util.TimerTask;
//schedule a task everyday at 3am, beginning at a specific date
public class ScheduleTask
{
private final static long PERIOD = 1000*60*60*24;
private final static int YEAR = 2011;
private final static int MONTH = 9; //month is 0 based, so 9 is October
private final static int DAY = 1;
private final static int HOUR = 3;
private final static int MINUTE = 0;
public static void main(String[ ] args)
{
TimerTask task = new QueryCheck( );
Timer timer = new Timer( );
timer.scheduleAtFixedRate(task,
setDate(YEAR,MONTH,DAY,HOUR,MINUTE),
PERIOD);
}
//construct a GregorianCalendar with the given date and time
private static Date setDate(int year, int month, int day, int hour, int minute)
{
Calendar result = new
GregorianCalendar(year,month,day,hour,minute);
return result.getTime();
}
}
Figure 9 Java Code Implementing Scheduling a Task
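With Timer.scheduleAtFixedRate, a first execution time that lies in the past causes the missed executions to run immediately to catch up, so a hard-coded start date needs care. One alternative, sketched below under stated assumptions, is to compute the delay until the next 3 am relative to the current time and hand it to a ScheduledExecutorService. This is not the project's code; the class DailyScheduleSketch and its methods are hypothetical names, and the sketch ignores daylight saving transitions.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical alternative to ScheduleTask that needs no hard-coded start date.
final class DailyScheduleSketch {
    // Time from 'now' until the next occurrence of 'target':
    // later today if it is still ahead, otherwise tomorrow.
    static Duration delayUntilNext(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Runs 'task' at the next occurrence of 'target' and every 24 hours after that.
    static ScheduledExecutorService scheduleDaily(Runnable task, LocalTime target) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelay = delayUntilNext(LocalDateTime.now(), target).toMillis();
        scheduler.scheduleAtFixedRate(task, initialDelay,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // Only the delay computation is demonstrated here, so the program exits at once.
        LocalDateTime now = LocalDateTime.of(2011, 10, 1, 2, 0);
        System.out.println(delayUntilNext(now, LocalTime.of(3, 0)));
    }
}
```

Since TimerTask implements Runnable, a QueryCheck instance could be passed as the task argument of scheduleDaily.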
4.2.3 Building Index for encryptionT Table
If there are many related tables and attributes, the encryptionT table can be very large. To
enhance the search performance, we build an index on table_name and attribute_name
for the encryptionT table, as shown in Figure 10.
CREATE TABLE EncryptionT (table_name varchar(15),
attribute_name varchar(15),
attribute_value varchar(15),
hashvalue varchar(40),
index (table_name, attribute_name));
Figure 10 Create encryptionT table with index
The query in Figure 11 can be used to query the encryptionT table.
SELECT
attribute_value
FROM encryptionT
WHERE hashvalue = v1 AND table_name = 'user_tb' AND attribute_name = 'username';
Figure 11 Query encryptionT Table with Index
4.2.4 Trigger
The database content of a web application is not static; it can be edited dynamically.
For example, a user can be added, deleted, or modified in the database. Three triggers
have been built so that the content of encryptionT reflects the most recent updates to the
original table. Taking the user_tb table as an example, three triggers have been
created on user_tb: insertTrigger, deleteTrigger and updateTrigger.
After a new row is inserted in user_tb, insertTrigger applies the HMAC function to the
inserted attribute values and inserts two new rows into the encryptionT table, one for each
attribute in user_tb: username and password. The code to create insertTrigger is shown
in Figure 12.
DROP TRIGGER IF EXISTS insertTrigger;
CREATE TRIGGER insertTrigger
AFTER INSERT ON user_tb
FOR EACH ROW
BEGIN
INSERT INTO EncryptionT SET
table_name = 'user_tb',
attribute_name = 'username',
attribute_value = NEW.username,
hashvalue = HMACSHA1(key, NEW.username);
INSERT INTO EncryptionT SET
table_name = 'user_tb',
attribute_name = 'password',
attribute_value = NEW.password,
hashvalue = HMACSHA1(key, NEW.password);
END
Figure 12 Create an insertTrigger on user_tb
Figure 13 shows the code to create deleteTrigger. It deletes all the records that
relate to the rows being deleted from the user_tb table.
updateTrigger applies the HMAC function to the updated attribute values and updates all
the records that relate to the rows being updated in the user_tb table. The code is
shown in Figure 14.
DROP TRIGGER IF EXISTS deleteTrigger;
CREATE TRIGGER deleteTrigger
AFTER DELETE ON user_tb
FOR EACH ROW
BEGIN
DELETE FROM EncryptionT
WHERE attribute_name = 'username'
AND attribute_value = OLD.username;
DELETE FROM EncryptionT
WHERE attribute_name = 'password' AND
attribute_value = OLD.password;
END
Figure 13 Create a deleteTrigger on user_tb
DROP TRIGGER IF EXISTS updateTrigger;
CREATE TRIGGER updateTrigger
AFTER UPDATE ON user_tb
FOR EACH ROW
BEGIN
UPDATE EncryptionT
SET attribute_value = NEW.username,
hashvalue = HMACSHA1(key,NEW.username)
WHERE attribute_name = 'username' AND
attribute_value = OLD.username;
UPDATE EncryptionT
SET attribute_value = NEW.password,
hashvalue = HMACSHA1(key,NEW.password)
WHERE attribute_name = 'password' AND
attribute_value = OLD.password;
END
Figure 14 Create an updateTrigger on user_tb
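The combined effect of the three triggers can be modeled in memory to check that encryptionT stays in step with user_tb. The sketch below is a simplified illustration and not the project's code: the class TriggerModelSketch, its Row type, and the injected hash function, which stands in for HMACSHA1(key, value), are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model of encryptionT and the three triggers on user_tb.
final class TriggerModelSketch {
    // One row of encryptionT (table_name is omitted because only user_tb is modeled).
    static final class Row {
        final String attributeName;
        String attributeValue;
        String hashValue;

        Row(String attributeName, String attributeValue, String hashValue) {
            this.attributeName = attributeName;
            this.attributeValue = attributeValue;
            this.hashValue = hashValue;
        }
    }

    private final List<Row> encryptionT = new ArrayList<>();
    private final UnaryOperator<String> hmac; // placeholder for HMACSHA1(key, value)

    TriggerModelSketch(UnaryOperator<String> hmac) {
        this.hmac = hmac;
    }

    // insertTrigger: one new row per attribute of the inserted user.
    void afterInsert(String username, String password) {
        encryptionT.add(new Row("username", username, hmac.apply(username)));
        encryptionT.add(new Row("password", password, hmac.apply(password)));
    }

    // deleteTrigger: remove every row whose attribute and value match the deleted user.
    void afterDelete(String username, String password) {
        encryptionT.removeIf(r -> matches(r, "username", username)
                || matches(r, "password", password));
    }

    // updateTrigger: rewrite value and hash of every row matching the old values.
    void afterUpdate(String oldUser, String oldPass, String newUser, String newPass) {
        for (Row r : encryptionT) {
            if (matches(r, "username", oldUser)) {
                r.attributeValue = newUser;
                r.hashValue = hmac.apply(newUser);
            } else if (matches(r, "password", oldPass)) {
                r.attributeValue = newPass;
                r.hashValue = hmac.apply(newPass);
            }
        }
    }

    boolean contains(String attributeName, String value) {
        return encryptionT.stream().anyMatch(r -> matches(r, attributeName, value));
    }

    int size() {
        return encryptionT.size();
    }

    private static boolean matches(Row r, String attributeName, String value) {
        return r.attributeName.equals(attributeName) && r.attributeValue.equals(value);
    }

    public static void main(String[] args) {
        // The lambda is a toy hash used only to keep the example short.
        TriggerModelSketch m = new TriggerModelSketch(s -> Integer.toHexString(s.hashCode()));
        m.afterInsert("alice", "pw1");
        m.afterUpdate("alice", "pw1", "alice", "pw2");
        System.out.println(m.contains("password", "pw2"));
        m.afterDelete("alice", "pw2");
        System.out.println(m.size());
    }
}
```

Because rows are matched by value, two users who share a password would also share the fate of their encryptionT rows: deleting one user removes the rows for both. The model reproduces this behavior, which may be worth checking against the intended semantics of Figures 13 and 14.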
4.3 Performance Evaluation
The evaluation was performed on a machine with an Intel Core i7 CPU and 4 GB of RAM,
running the 64-bit Windows 7 Enterprise operating system. The database server is MySQL
5.1, which is installed on the same machine as the application server.
The proposed technique introduces some overhead at runtime. The overhead mainly
results from querying the encryptionT table. For evaluation purposes, the user_tb table is
populated with 100, 250, 750, 2500, 5000, and 25000 rows. The table schema is
shown in Table 1. Accordingly, the encryptionT table has 200, 500, 1500, 5000,
10000, and 50000 rows. The overhead is measured by executing the query in Figure 11 to
look up the hashed value in the encryptionT table. The result is shown in Figure 15.
[Figure 15: plot of execution time in milliseconds (axis from 0 to 300) for our approach against the number of rows in the encryptionT table (200, 500, 1500, 5000, 10000, 50000)]
Figure 15 Runtime Response Time
The time overhead of the proposed approach is less than 250 milliseconds even for the
largest set of records tested. It has no significant impact on the performance of the web
application.
Chapter 5
CONCLUSION
In this project, different types of SQL injection attacks were examined, and the
countermeasure techniques proposed by researchers were surveyed. Based on the study,
this project presented a new technique to detect and prevent SQL injection attacks. The
technique combines static analysis of web application code with a runtime input validation
check. During static analysis, the code segments where the web application interacts
with the underlying database are identified and the SQL query strings are analyzed. The
retrieved table names and attribute names are used to build the encryptionT table, which
stores the related attribute values and their hashed values. At run time, the same hash
function is applied to the user input. Execution of the web application resumes only if both
the hashed value and the user input match the data stored in the encryptionT table. The
advantage of this approach is that the existing database tables used in the web application
are not affected. The implementation details were described in this report.
A few improvements can be made in the future. Currently, the system is implemented in
Java and is applicable to Java-based web applications that use the JDBC API. In the
future, a more general system, which works for web applications based on the Java
Persistence API, .NET, and PHP, can be developed. Moreover, extensive and realistic
evaluation needs to be performed to test the system.
APPENDIX
Source Code
Base.java
import java.sql.*;
import org.apache.commons.lang.*;
import java.security.*;
import javax.crypto.*;
import javax.crypto.spec.SecretKeySpec;
/**
* Class Base handles database connection and HMAC calculation
* @author Xiaoying Shen
*
*/
public class Base
{
//Connect to db
public static Connection getConnection(){
Connection con = null;
String url = "jdbc:mysql://localhost/oo";
String username = "sa";
String password = "B33f34t3r";
try
{
Class.forName("com.mysql.jdbc.Driver");
con = DriverManager.getConnection(url, username, password);
System.out.println("Connect to MySql.");
}
catch (Exception e){
e.printStackTrace();
}
return con;
}
//Close db connection
public static void closeConnection(Connection con)
{
if(con!= null){
try {
con.close();
}
catch (SQLException e) {
e.printStackTrace();
}
}
}
//initiate appuser table, populate the table with specified rows of data
public static void InitiateUserTable(Connection con, int rowNum)
{
ResultSet rs = null;
String user = "";
String pass = "";
try
{
for(int i = 0; i <rowNum; i++)
{
//randomly generate username and password
user = RandomStringUtils.randomAlphabetic(6);
pass = RandomStringUtils.randomAlphanumeric(6);
String query = "insert into user_tb values('"+ user+"','" +pass+"')";
Statement stmt = con.createStatement();
stmt.executeUpdate(query);
}
}
catch (Exception e){
e.printStackTrace();
}
}
//HMAC
public static String CalHmac(String input, String aKey)
{
byte [] digest = null;
try
{
SecretKeySpec secret = new SecretKeySpec(aKey.getBytes(),"HmacSHA1");
Mac mac;
mac = Mac.getInstance("HmacSHA1");
mac.init(secret);
digest = mac.doFinal(input.getBytes());
}
catch (NoSuchAlgorithmException e)
{
e.printStackTrace();
}
catch (InvalidKeyException e)
{
e.printStackTrace();
}
StringBuffer sb=new StringBuffer();
for (int i = 0; i < digest.length; i++) {
String hex=Integer.toHexString(0xff & digest[i]);
if(hex.length()==1) sb.append('0');
sb.append(hex);
}
return sb.toString();
}
}
QueryCheck.java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.TimerTask;
import org.apache.commons.lang.RandomStringUtils;
/**
* Class QueryCheck initializes encryptionT table, creates trigger, and performs run time user
* input validation check.
* @author Xiaoying Shen
*
*/
public class QueryCheck extends TimerTask{
static String key = "";
public static void main(String[] args)
{
Connection con = null;
String tableName = "user_tb";
String[] columns = {"username", "password"};
String attribute = "username";
String userInput = "'1=1--";
try
{
con = Base.getConnection();
//Base.InitiateUserTable(con, 100);
//Initialization
GetKey();
InitializeEncryptionT(con, tableName, columns);
setTrigger(con, tableName, columns);
//runtime check
System.out.println(hashCompare(con,tableName, attribute, userInput));
}
catch (Exception e)
{
e.printStackTrace();
}
finally
{
Base.closeConnection(con);
}
}
public static void GetKey()
{
key = RandomStringUtils.randomAlphanumeric(40);
}
//read the tuples in user_tb, calculate hash value
//and write the input and hashValue to encryptionT
public static void InitializeEncryptionT(Connection con, String tableName, String[] attributes)
{
//Connection con = null;
ResultSet rs = null;
String attribute = null;
String input;
String hashValue;
String query = null;
Statement stmt = null;
try{
stmt = con.createStatement();
query="drop table if exists EncryptionT";
stmt.executeUpdate(query);
query = "CREATE TABLE EncryptionT(table_name varchar(15), " +
"attribute_name varchar(15), attribute_value varchar(15), " +
"hashvalue varchar(40), index(table_name, attribute_name))";
stmt.executeUpdate(query);
query = "select * from " + tableName;
rs = stmt.executeQuery(query);
while(rs.next()){
for(int i = 0; i < attributes.length; i++)
{
attribute = attributes[i];
input = rs.getString(i+1);
System.out.println(input);
hashValue = Base.CalHmac(input, key);
writeToDb(con, tableName,attribute, input, hashValue);
}
}
}
catch (Exception e){
e.printStackTrace();
}
}
//write the hashed input into EncryptionT table
private static void writeToDb (Connection con, String tableName, String attribute, String input,
String hashInput){
try{
String query = "insert into EncryptionT values('"+ tableName +"','"+ attribute+"','" +
input+ "','"+hashInput+"')";
//System.out.println(query);
Statement stmt = con.createStatement();
stmt.executeUpdate(query);
}
catch(Exception e){
e.printStackTrace();
}
}
//create insertTrigger, deleteTrigger and updateTrigger
public static void setTrigger(Connection con, String tableName, String [] attributes)
{
try
{
Statement stmt = con.createStatement();
//insert trigger
String dropQuery = "DROP TRIGGER IF EXISTS insertTrigger";
stmt.executeUpdate(dropQuery);
String insertQuery = "";
for (int i = 0; i < attributes.length; i++)
{
insertQuery += "INSERT INTO EncryptionT SET table_name = '"+
tableName+"', "+ "attribute_name = '"+ attributes[i] +"', "+
"attribute_value = NEW."+attributes[i]+ ", "+
"hashvalue = HMACSHA1('"+ key + "',"+"New."+attributes[i]+");";
}
String insertTriggerQuery =
"CREATE TRIGGER insertTrigger " +
"AFTER INSERT ON " + tableName +
" FOR EACH ROW" +
" BEGIN " + insertQuery+
" END";
stmt.executeUpdate(insertTriggerQuery);
//delete trigger
dropQuery = "DROP TRIGGER IF EXISTS deleteTrigger";
stmt.executeUpdate(dropQuery);
String deleteQuery = "";
for (int i = 0; i < attributes.length; i++)
{
deleteQuery += "DELETE FROM EncryptionT WHERE table_name = '"
+ tableName + "' and "+ "attribute_name = '"+
attributes[i] + "' and "+"attribute_value = OLD."+
attributes[i] + " ;";
}
String deleteTriggerQuery =
"CREATE TRIGGER deleteTrigger "+
"AFTER DELETE ON "+ tableName +
" FOR EACH ROW"+
" BEGIN " + deleteQuery +
" END";
stmt.executeUpdate(deleteTriggerQuery);
//update trigger
dropQuery = "DROP TRIGGER IF EXISTS updateTrigger";
stmt.executeUpdate(dropQuery);
String updateQuery = "";
for (int i = 0; i < attributes.length; i++)
{
updateQuery += "UPDATE EncryptionT SET attribute_value = NEW."+
attributes[i]+", "+ "hashvalue = HMACSHA1('"+ key + "',"+
"New."+attributes[i]+") "+"WHERE table_name = '" + tableName +
"' and "+"attribute_name = '"+ attributes[i] + "' and "+
"attribute_value = OLD."+ attributes[i] + " ;";
}
String updateTriggerQuery =
"CREATE TRIGGER updateTrigger " +
"AFTER UPDATE ON " + tableName +
" FOR EACH ROW" +
" BEGIN " + updateQuery +
" END";
stmt.executeUpdate(updateTriggerQuery);
}
catch (Exception e){
e.printStackTrace();
}
}
//when there's a user input in web app, calculate the hash value for the input first, then try to find
//the same hashValue in EncryptionT table, if there's a match, check if the attribute and user
//input value also match.
public static boolean hashCompare(Connection con, String tableInput, String attriInput, String
userInput){
boolean result = false;
ResultSet rs = null;
String input = null;
//calculate hash value for user input
try{
String hashedInput = Base.CalHmac(userInput, key);
String query = "select attribute_value from EncryptionT where hashvalue=? and " +
"table_name=? and attribute_name=?";
PreparedStatement stmt = con.prepareStatement(query);
stmt.setString(1, hashedInput);
stmt.setString(2, tableInput);
stmt.setString(3, attriInput);
rs = stmt.executeQuery();
while(rs.next()){
input = rs.getString(1);
//compare user input with return result from query
if (input.equals(userInput))
{
result = true;
}
}
}
catch (Exception e){
e.printStackTrace();
}
return result;
}
@Override
public void run()
{
Connection con = Base.getConnection();
//generate a new key
GetKey();
//initialize EncryptionT table
String[] columns = {"username","password"};
InitializeEncryptionT(con,"user_tb",columns);
System.out.println("encryptionT table updated.");
//update three triggers
setTrigger(con, "user_tb", columns);
System.out.println("triggers updated.");
Base.closeConnection(con);
}
}
ScheduleTask.java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Timer;
import java.util.TimerTask;
/**
* schedule a task everyday at 3am, beginning at a specific date
* @author Xiaoying Shen
*
*/
public class ScheduleTask
{
private final static long PERIOD = 1000*60*60*24;
private final static int YEAR = 2011;
//month is 0 based, January is 0
private final static int MONTH = 10;
private final static int DAY = 8;
private final static int HOUR = 3;
private final static int MINUTE = 0;
public static void main(String[] args)
{
System.out.println("About to schedule task.");
TimerTask task = new QueryCheck();
Timer timer = new Timer();
timer.scheduleAtFixedRate(task, setDate(YEAR,MONTH,DAY,HOUR,MINUTE),
PERIOD);
//timer.schedule(task, setDate(YEAR, MONTH, DAY, HOUR, MINUTE));
System.out.println("Task scheduled.");
}
//construct a GregorianCalendar with the given date and time
private static Date setDate(int year, int month, int day, int hour, int minute)
{
Calendar result = new GregorianCalendar(year,month,day,hour,minute);
return result.getTime();
}
}
BIBLIOGRAPHY
[1] Web Application Security Statistics, [Online]
Available: http://www.webappsec.org/projects/statistics/
[2] W. G. Halfond, J. Viegas, and A. Orso, “A Classification of SQL-Injection Attacks
and Countermeasures,” In Proceedings of the Intern. Symposium on Secure Software
Engineering (ISSSE 2006), Mar. 2006.
[3] Y. Huang, S. Huang, T. Lin, and C. Tsai. Web Application Security Assessment by
Fault Injection and Behavior Monitoring. In Proceedings of the 11th International
World Wide Web Conference (WWW 03), May 2003.
[4] C. Gould, Z. Su, and P. Devanbu. JDBC Checker: A Static Analysis Tool for
SQL/JDBC Applications. In Proceedings of the 26th International Conference on
Software Engineering (ICSE 04) –Formal Demos, pages 697–698, 2004.
[5] C. Gould, Z. Su, and P. Devanbu. Static Checking of Dynamically Generated Queries
in Database Applications. In Proceedings of the 26th International Conference on
Software Engineering (ICSE 04), pages 645–654, 2004.
[6] W. G. Halfond and A. Orso. AMNESIA: Analysis and Monitoring for NEutralizing
SQL-Injection Attacks. In Proceedings of the IEEE and ACM International
Conference on Automated Software Engineering (ASE 2005), Long Beach, CA, USA,
Nov 2005.
[7] W. G. Halfond and A. Orso. Combining Static Analysis and Runtime Monitoring to
Counter SQL-Injection Attacks. In Proceedings of the Third International ICSE
Workshop on Dynamic Analysis (WODA 2005), pages 22–28, St. Louis, MO, USA,
May 2005.
[8] Y. Huang, F. Yu, C. Hang, C. H. Tsai, D. T. Lee, and S. Y. Kuo. Securing Web
Application Code by Static Analysis and Runtime Protection. In Proceedings of the
12th International World Wide Web Conference (WWW 04), May 2004.
[9] G. T. Buehrer, B. W. Weide, and P. A. G. Sivilotti. Using Parse Tree Validation to
Prevent SQL Injection Attacks. In International Workshop on Software Engineering
and Middleware (SEM), 2005.
[10] S. W. Boyd and A. D. Keromytis. SQLrand: Preventing SQL Injection Attacks. In
Proceedings of the 2nd Applied Cryptography and Network Security (ACNS)
Conference, pages 292–302, June 2004.
[11] F. Valeur, D. Mutz, and G. Vigna. A Learning-Based Approach to the Detection of
SQL Attacks. In Proceedings of the Conference on Detection of Intrusions and
Malware and Vulnerability Assessment (DIMVA), Vienna, Austria, July 2005.
[12] Smith, B. and Williams, L. Using SQL Hotspots in a Prioritization Heuristic for
Detecting Web Application Vulnerabilities. International Conference on Software
Testing, Verification, and Validation (ICST), Berlin, pp. 220-229, 2011.
[13] General SQL Parser Documents, [Online]
http://www.sqlparser.com/document.php
[14] HMAC: Keyed-Hashing for Message Authentication, [Online]
Available: http://www.ietf.org/rfc/rfc2104.txt