The Essence of Command Injection Attacks in Web Applications
Zhendong Su and Gary Wassermann
Presented by Alon Kremer, April 2011

Outline
◦ Command injection attacks in web applications
◦ Formal definition of a web application
◦ Formal definition of a command injection attack
◦ An algorithm to prevent those attacks

Attacking the Web Application
Web application:
◦ Takes input strings from the user and interprets them.
◦ Interacts with a back-end database.
◦ Retrieves data and dynamically generates new content.
◦ Presents the output to the user.
The threat – Command Injection Attack:
◦ Unexpected input may cause problems.

Web Application Architecture
The application generates a query based on user input.
[Figure: web browser, application, database]

SQLCIAs - Example
    String query = "SELECT cardnum FROM accounts WHERE username = '" + strUName + "' AND cardtype = " + strCType + ";";
Expected input (strUName is John, strCType is 2):
    SELECT cardnum FROM accounts WHERE username = 'John' AND cardtype = 2;
Result: returns John's saved credit card number.
Malicious input (strCType is 2 OR 1 = 1):
    SELECT cardnum FROM accounts WHERE username = 'John' AND cardtype = 2 OR 1 = 1;
The condition parses as (username = 'John' AND cardtype = 2) OR (1 = 1).
Result: returns all saved credit card numbers.

Web Application – Formally
A function from n-tuples of input strings to query strings.
It does not check for changes in the query structure, and it gives no information about the source of the strings.
    ⟨"John", "2"⟩ ↦ "SELECT cardnum FROM ccards WHERE name = 'John' AND cardtype = 2"

Quick Overview
Many web applications are vulnerable, and a single attack can expose many private records.
Ways to regulate user inputs:
◦ Filter out "bad" strings. ('O'brian'?)
◦ Escape quotes. (2 OR 1=1?)
◦ Limit the input's length.
◦ Regular expressions, etc.
The root cause of the problem is that the input changes the syntactic structure of the whole query.
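The two caveats attached to the input-regulation list above can be made concrete with a small sketch. The function names `build_query` and `escape_quotes` are hypothetical stand-ins for the string-concatenating application and a naive quote-escaping filter; they are not taken from the paper or the slides.

```python
def build_query(username: str, cardtype: str) -> str:
    """Hypothetical application: maps a tuple of input strings to a query string."""
    return ("SELECT cardnum FROM accounts WHERE username = '" + username
            + "' AND cardtype = " + cardtype + ";")


def escape_quotes(value: str) -> str:
    """Naive filter (an assumption for illustration): double every single quote."""
    return value.replace("'", "''")


# Benign input: the query keeps the structure the developer intended.
benign = build_query(escape_quotes("John"), escape_quotes("2"))

# The attack string contains no quote, so the filter leaves it untouched
# and the WHERE clause gains an extra OR branch.
attack = build_query(escape_quotes("John"), escape_quotes("2 OR 1 = 1"))

print(benign)
print(attack)
```

Escaping keeps a legitimate name such as O'brian usable, but it does nothing for an unquoted numeric field, which is why the slides point to the query's syntactic structure rather than to individual characters.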

SQLCIAs – Informally
A SQLCIA modifies the syntactic structure of a query.
Our goal is to track user inputs with metadata, the delimiters ⦇ and ⦈, so that the input is syntactically confined in the augmented query.
Modify the SQL grammar to include the metadata:
    nonterm ::= ⦇ symbol ⦈
Attempt to parse the augmented query:
◦ Fails ⇒ block; succeeds ⇒ allow.

Valid Syntactic Forms
Given G = {V, Σ, S, P}, choose a policy U ⊆ V ∪ Σ describing the inputs we want to allow.
The idea of a valid syntactic form (VSF) is that the parse tree has a node in U whose descendants are exactly the input substring.
    b_term ::= b_term AND cond
    cond ::= val comp val
    val ::= num | id
    comp ::= < | > | = …
With U = { cond }: 3 < x is a VSF; 2 OR 1 = 1 is not.

SQLCIAs – Formally
Query q is a SQLCIA if
◦ q has a parse tree T_q.
◦ For some filter f and some input i:
◦ f(i) is a substring of q and is not a VSF in T_q.

Augmented Query
Our goal is to track and identify the user input inside the query (in the parse tree).
By augmenting each input i_k to ⦇i_k⦈ we can determine which substrings of the constructed query come from the input.
A query q_a is an augmented query if it was generated from augmented input:
    q_a = W(⦇i_1⦈, …, ⦇i_n⦈)

Augmented Grammar
Given G = {V, Σ, S, P} and U ⊆ V ∪ Σ.
An augmented query q_a is in L(G_a) iff
◦ q is in L(G), and
◦ for each substring s enclosed by a pair of matching ⦇, ⦈, s is a VSF once the metacharacters are removed.
    G_a = {V ∪ {u_a | u ∈ U}, Σ ∪ {⦇, ⦈}, S, P_a}
    u_a: fresh nonterminal
    P_a = {v → rhs_a | v → rhs ∈ P} ∪ {u_a → u | u ∈ U} ∪ {u_a → ⦇u⦈ | u ∈ U}

Augmented Grammar
{v → rhs_a | v → rhs ∈ P} constructs production rules in which every right-hand-side occurrence of u ∈ U is replaced with u_a.
Example, with U = { b, D }:
    P:
        S ::= bCD
        C ::= c
        D ::= d | dd
    P_a:
        S ::= b_a C D_a
        b_a ::= ⦇b⦈ | b
        C ::= c
        D_a ::= ⦇D⦈ | D
        D ::= d | dd

Theorem
For all i_1, …, i_n: W(⦇i_1⦈, …, ⦇i_n⦈) = q_a ∈ L(G_a) iff W(i_1, …, i_n) = q ∈ L(G) and q is not an SQLCIA.
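The augmented-grammar construction and the theorem can be tried out on the toy grammar from the example (S ::= bCD, U = { b, D }). What follows is a minimal sketch, not the authors' implementation: `augment`, `derives`, and `matches` are hypothetical helper names, the delimiters are ASCII parentheses standing in for the two metacharacters, and the recognizer is brute force, so it suits only tiny grammars.

```python
OPEN, CLOSE = "(", ")"  # ASCII stand-ins for the two metacharacters (an assumption)


def augment(productions, policy):
    """Build P_a: swap each right-hand-side u in U for u_a, then add u_a ::= u | OPEN u CLOSE."""
    def mark(symbol):
        return symbol + "_a" if symbol in policy else symbol

    augmented = {
        head: [[mark(symbol) for symbol in rhs] for rhs in alternatives]
        for head, alternatives in productions.items()
    }
    for u in sorted(policy):
        augmented[u + "_a"] = [[u], [OPEN, u, CLOSE]]
    return augmented


def derives(grammar, symbol, text):
    """True if `symbol` derives exactly `text`.

    Brute force: only for small grammars without left recursion or empty productions.
    """
    if symbol not in grammar:  # terminal symbol
        return text == symbol
    return any(matches(grammar, rhs, text) for rhs in grammar[symbol])


def matches(grammar, rhs, text):
    """True if the symbol sequence `rhs` derives exactly `text`."""
    if not rhs:
        return text == ""
    return any(
        derives(grammar, rhs[0], text[:k]) and matches(grammar, rhs[1:], text[k:])
        for k in range(len(text) + 1)
    )


P = {"S": [["b", "C", "D"]], "C": [["c"]], "D": [["d"], ["d", "d"]]}
U = {"b", "D"}
P_a = augment(P, U)

for query in ["bcd", "(b)cd", "bc(dd)", "bc(d)d", "b(c)d"]:
    verdict = "allow" if derives(P_a, "S", query) else "block"
    print(query, "->", verdict)
```

In this sketch the first three strings parse under the augmented grammar and the last two do not: in bc(d)d the delimited input covers only part of D ::= dd, and in b(c)d the delimited symbol C is not in U. Stripping the delimiters from the rejected strings still gives members of L(G), which is the case the theorem classifies as an attack.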

Implementation
Metacharacters: two random four-letter strings, excluding dictionary words.
Total candidates: 26^4 − 72,421 = 384,555.
Most user inputs are dictionary words, passwords containing numbers, or strings of a length other than four letters, so the probability of an input using the metacharacters is 0.000052.
The policy U is defined in terms of which nonterminals in the SQL grammar are permitted to be at the root of a VSF.

Implementation
[Figure: the SQL grammar G and the policy U are augmented into the augmented SQL grammar G', which a parser generator turns into SQLCheck. SQLCheck sits between the application (fed by the web browser) and the database, and returns q if q_a ∈ L(G_a). The delimiters are randomly generated strings. Grammar fragments shown: bool ::= term_a …; term_a ::= term | ⦇term⦈; term ::= fac …; fac_a ::= fac | ⦇fac⦈]

Test Subjects
Subject | Description | LOC (PHP) | LOC (JSP) | Query Checks Added | Query Sites
Employee Directory | Online employee directory | 2,801 | 3,114 | 5 | 16
Events | Event tracking system | 2,819 | 3,894 | 7 | 20
Classifieds | Online management system for classifieds | 5,540 | 5,819 | 10 | 41
Portal | Portal for a club | 8,745 | 8,870 | 13 | 42
Bookstore | Online bookstore | 9,224 | 9,649 | 18 | 56
• Two languages (PHP & JSP):
  – Most techniques require a language-specific front-end; ours does not.

Evaluation
Language | Subject | Legitimate queries (Attempted / Allowed) | Attack queries (Attempted / Prevented) | Timing mean (ms) | Timing std dev (ms)
PHP | Employee Directory | 660 / 660 | 3937 / 3937 | 3.230 | 2.080
PHP | Events | 900 / 900 | 3605 / 3605 | 2.613 | 0.961
PHP | Classifieds | 576 / 576 | 3724 / 3724 | 2.478 | 1.049
PHP | Portal | 1080 / 1080 | 3685 / 3685 | 3.788 | 3.233
PHP | Bookstore | 608 / 608 | 3473 / 3473 | 2.806 | 1.625
JSP | Employee Directory | 660 / 660 | 3937 / 3937 | 3.186 | 0.652
JSP | Events | 900 / 900 | 3605 / 3605 | 3.368 | 0.710
JSP | Classifieds | 576 / 576 | 3724 / 3724 | 3.134 | 0.548
JSP | Portal | 1080 / 1080 | 3685 / 3685 | 3.063 | 0.441
JSP | Bookstore | 608 / 608 | 3473 / 3473 | 2.897 | 0.257
RTT over the internet: ~80-100 ms

Conclusions
A formal definition of SQLCIAs and an algorithm that prevents them by syntactically constraining substrings that come from user input.
SQLCheck intercepts all queries and checks their syntactic form.
Suitable for different languages and web interfaces.

Future Work
Experiment with more real-world online web applications and more sophisticated testing techniques (input placeholder).
Apply to XSS, XPath injection, etc.

A few thoughts about the article
The formal definitions of the web application and of the SQLCIA cover their most common and basic properties.
The algorithm is simple and elegant.
The solution suits all web applications, even in different programming languages.
The input policy is easy to control.
The evaluation did not test against attackers attempting to defeat this particular mechanism.