SEC835 Prevent SQL and command injection. Prevent XSS Introduction Command injection vulnerabilities require minimal resources to exploit. They have been exploited to affect the privacy of information, the integrity of data, and the availability of a service to other users. They can be effectively mitigated with simple development techniques. How it happens A programmer attempts to insert user input as a parameter into a dynamically crafted command line or query statement. If the implementation fails to filter metacharacters out, they will be understood by the query parser or command shell, or by interpreter, and executed This way, a hacker can inject a malicious code and execute it Example of special characters On Unix-like machines Semicolon (ends a statement) Backtic (data between backtics gets executed as a code) Vertical bar (everything after the bar is treated as another related process) If an attacker manages to add them, he can execute an arbitrary code Normal Command Parsing Sequence Attack Command Parsing Sequence Command injection technique Command line may be injected through: Data input http parameter Buffer overflow Format-string parameter SQL injection attack Is a kind of command injection SQL injection technique An attacker could extend the query statement to include records or columns from other tables not originally intended by the developer to be exposed This extension could append its information to the existing output, or overwrite it Parsing engine breaks statements into individual tokens for processing, looking for meta-characters The statement has been changed and executed SQL SQL uses single and double quotes to switch between data and code. Semi-colons separate SQL statements Example query: "UPDATE users SET prefcolour='red' WHERE uid='joe';" This command could be sent from a web front-end to a database engine. The database engine then interprets the command Dynamic SQL Generation Web applications dynamically generate the necessary database commands by manipulating strings Example query generation: myQuery = "UPDATE users SET prefcolor=‘" + colour + "' WHERE uid=‘" + authUserId + "'"; Where the value of “colour" would be originating from the client web browser, through the web server. And where the value for “authUserId" would have been stored on the server and verified through some authentication scheme Client Web Browser Forms in client browsers return values to the web server through either the POST or GET methods • "GET" results in a url with a "?" before the values of the form variables are specified: • http://www.example.com/script?colour=red • The value of “color" is set to "red" in the script "GET" urls are convenient to hack, but there isn't any significant difference in the security of either "GET" or "POST" methods because the data comes from the client web browser regardless and is under the control of the remote attacker The SQL Table Tables are used to store information in fields (columns) in relation to a key (e.g., "uid") What other fields could be of interest? CREATE TABLE users ( prefcolour varchar(20), uid VARCHAR(20) NOT NULL, privilege ENUM('normal', 'administrator'), PRIMARY KEY (uid) ); A Malicious SQL Query What if we could make the web server generate a query like: "UPDATE users SET prefcolour='red', privilege='administrator' WHERE uid='joe';“ Can we engineer the value of "colour" given to the web server so it generates this query? • Note how code and data are mixed in the same channel Malicious HTTP Request http://www.example.com/script?colour=red',privileg e='administrator The "color" input is then substituted to generate SQL: sqlStatement.execute("UPDATE users SET prefcolor='" + colour + "' WHERE uid='" + authUserId + "'" ); It gives the query we wanted! sqlStatement.execute( "UPDATE users SET prefcolour='red', privilege='administrator' WHERE uid='joe'" ); Results Joe now has administrator privileges. Adding Another SQL Query Let's say Joe wants to run a completely different query: "DELETE FROM users" This will delete all entries in the table! How can the value of "colour" be engineered? Malicious HTTP Request http://www.example.com/script?colour=red '%3Bdelete+from+users%3B %3B is the url encoding for ";" What happens when the "colour" input is used to generate SQL? sqlStatement.execute( "UPDATE users SET prefcolor='" + colour + "' WHERE uid='" + authUserId + "'"; ); Result UPDATE users SET prefcolor='red'; delete from users; WHERE uid='joe' The last line generates an error, but it's already too late; all entries have been deleted. The middle query could have been anything (not just delete, how about return all credit card info) Another SQL injection example If an SQL statement such as “SELECT * FROM TestTable WHERE Id = <var>” where to substitute <var> (a user-supplied input field) with a string such as “2;SHUTDOWN;”, the resulting statement (SELECT * FROM TestTable WHERE Id = 2;SHUTDOWN;) becomes two statements: the first, a standard database query, and the second, a command to shut down the database server. What is the root problem? The application does not validate data input, or validate it improperly Example The appearance of a certain number of spaces has been treated as the end of data input. However, an attacker consciously put the command line at the end of long string It is especially dangerous when The length of data input is not limited Data format is not defined Any characters are allowed Improper data validation Problems Data validation does not look at the special characters Data validation routine is not specific enough Data validation components can be bypassed Compounding Factors There are two additional factors besides unfiltered input that significantly contribute to the impact of Command Injection Vulnerabilities: Level of Privilege • An interpreter should not be assigned high level of access privileges, however it happens often (administrator or root access given). The high level of privileges aggravates the harmfulness of an injected code Error Handling • Error messages may help an attacker by providing technical details A Variety of Injection Attacks SQL & LDAP Injection Attacks on Databases Shell Command Injection Attacks on the Operating System Scripting Language Injection Attacks on the Application Platform Cross-Site Script Injection Attacks on Clients/Browsers XPath Injection Attacks on Processing Engines Command injection example An application provides an interface to display a directory listing based on a user-supplied string (intended to be the name of a directory on the system). The developer chose to exploit the system exec() function and pass it the “dir” command along with the appended user-supplied directory name. If the directory name were the command “del c:\winnt\explorer.exe” the resulting command (dir;del c:\winnt\explorer.exe) might successfully erase the system’s shell. (Depending on the user’s level of privilege.) Script Injection Example Some scripting languages (i.e. JavaScript, PHP) offer the ability to evaluate a string of text as script code, allowing an application to execute dynamic (and potentially arbitrary) code segments. If user-supplied input is passed to such a function, an attacker gains the ability to run whatever code he or she pleases. exercise Article re SQL injection http://www.owasp.org/index.php/Reviewing _Code_for_SQL_Injection Comment your spreadsheet cell K3 Friday Lab task Work on the spreadsheet re Vulnerability Cells K3 and H3 – put your comments What are the venues leading to command and SQL injection Read the article http://www.owasp.org/index.php/Improper_ Data_Validation Friday Protection Data input validation Be especially careful about special characters Be aware about related techniques Favour API calls versus calling commands via shell (to avoid interpreters) Protection Against SQL and command line injections Validation of Input and Output Use strict, customized validation to detect any meta-characters that might be recognized by an invoked interpreter. The validation function should block any malicious inputs outright. As another step in security, output validation can be used to determine if the values being sent back to the user are sane. (i.e. checking return values) Input and Output Filtering In addition to (or as an alternative to) validation of inputs and outputs, filtering functions may be used to strip out disallowed characters and return only the remainder, or neutralizes dangerous characters through encoding techniques Sample Meta-character Table MetaCharacter General Purpose “ (doublequote) Used to “wrap” an input, defining the boundaries of a token or parameter. i.e. in scripting environments (JSP, PHP, or various shells) they start and end tokens comprised of multiple words [with spaces between them]. Used to ‘wrap’ an input, defining the boundaries of a token or parameter. i.e. SQL statements: SELECT * FROM wherelanguages, SomeValue =backticks ‘foo’ In certain shell environments andtblTest scripting ‘ (single-quote) ` (backtick) : (colon) indicate a section of the string meant to be treated as system commands. Multiple uses, including a delimiter in tokenized strings, a URI syntax operator, a shell command for devices or drives, etc. ; (semi-colon) Often indicates the end of a statement, followed by a new statement. i.e. cd /; rm –rf *; echo ‘oops’; | (pipe) Used to redirect output from one statement from standard output to a new destination. (i.e. cat /etc/passwd | grep ‘root’) \ (backslash) Multiple uses, including an escape meta-character which might foil lesser escaping mechanisms; also used as a delimiter for path navigation. Meta-character table The table must be customized to recognize the unique meta-characters for each parsing engine that will receive statements carrying user inputs. Data Validation - White List Policy Validate data based on White-List policy You have a better chance to enforce the policy rules when you state explicitly what is allowed vs. a huge range of not allowed but unknown values The application remains safe when you extends the white list Black list can guard only against known negative cases White-list limitation – use of extended character sets (International languages) Data validation checklist Always perform data-input validation on the server, even if it was already performed on the client. Design specific data-input validation procedures that match the business rules. Disallow unlimited data strings. Validate the data type and format. Validate data completeness and consistency. Validate the use of special characters. Data validation checklist (cont) Block any invalidated or incomplete pieces of data and never try to interpret them. They must be returned to the user. Use a white-list policy to validate data and not a black-list policy. Validate an XML document against a DTD and XML Schema. Validate data at least twice: as close to an originator as possible, and immediately before usage. Database Engines: Stored Procedures and Views If the database engine supports them, use stored procedures and/or views when creating queries. Most commercial database engines restrict or negate attempts to modify the parameter list of a stored procedure, or to append additional commands when invoking them. Implement additional level of validation built in for each parameter Remove certain dangerous stored procedures (i.e. xp_cmdshell in SQL Server) – that is the DBA task Database Engines: Stored Procedures and Views Using the SQL “View”, employ them for queries and pass only limited parameters to narrow the result set Disallow OLE-DB adhoc queries. It can be achieved for SQL Server by modifying the registry as follows HKEY_LOCAL_MACHINES\Software\Microso ft\MSSQLServer\ MSSQL\DisAllowAdhocAccess Database Engines: Parameterized Statements Avoid using dynamically crafted statements since a SQL statement, assembled dynamically as a string, offers no protection for proper syntax. // establish connection // build a query using the supplied employee id String sSQLQuery = “SELECT Salary, Benefits FROM Employees WHERE ID = ‘“ + sEmployeeID + “’”; // run the query // Note: Here, if sEmployeeID were set to a string such as: “x’ OR ‘t’ = ‘t”, an employee could see all employee records If absolutely necessary, use static or parameterized statements for code-level crafted queries, and enforce strict syntax and neutralize special characters. (Example on the following slide.) Parameterized Statements Continued // establish connection java.sql.PreparedStatement psQuery = dbConnection.prepareStatement( "SELECT Salary, Benefits FROM Employees WHERE ID = ‘?’" ); psQuery.setInt( 1, sEmployeeID ); // run the query // manage the result set // disconnect // Note: Here, the format of the query string is more strictly enforced, and the range of possible values for the id limited to integers Summary: What to Do Enforce proper data input validation Use stored procedures, prepared statements, views Encode meta-characters in data output Do that for all sorts of data input/output: GUI http parameters Data files sent as a bulk input Reduce error message verbosity Minimize the access privileges for interpreters and parsers Enforce logging and auditing mechanisms Combat social engineering Summary: What NOT to do! Do not rely on a language’s generic metacharacter filter. Avoid executing shell commands that include untrusted inputs. Avoid dynamically created query statements in favour of prepared statements, stored procedures, views, etc. Avoid invoking any external command sets without appropriate levels of logging and auditing enabled both in the application and in the supporting platform(s). Ensure correct data source Canonicalization issue Canonical representation issue relates to the ability of presenting resource names in various equivalent forms. The problem occurs when an application makes wrong decision based on a noncanonical representation of a name. Variances not always are captured by the software thus creating security flaws. It concerns first of all to server, user and data files/folders names. Problems with Encoded Characters The encoding of characters in a variety of standards (i.e. UTF-8, URLEncode and HTML Entities, ISO-8859-1, etc.) has made translating between those formats a nightmare for developers. Vulnerabilities have resulted from: Canonicalization Errors Double Encoding “Homographic” Spoofing 45 Canonicalization and Double Encoding Resource names supplied to an application from external sources have often been used in authorization decisions when accessing those resources. Early versions of those decisions simply checked for static matches (i.e. disallowing access to ‘/etc/passwd’ and stripping out extra periods). The support for encoded characters allowed attackers to bypass the authorization checks (i.e. using %2e instead of a period, and attacking with resource names such as ‘/home/%2e%2e/etc/passwd’) and access restricted resources. 46 Canonicalization and Double Encoding Continued The first, quickly-implemented solution involved a decode operation prior to the authorization check, which foiled attempts to inject encoded characters. …until it was discovered that encoding the encoded characters would allow their attack strings through the decode “fix”, and access the resource in the same fashion as their original attacks. Instead of limiting the conditions for success, the developers were attempting to counter negative cases as incidents arose, leaving their systems vulnerable in the interim. 47 “Homographic” Spoofing Internationalization support has created another form of encoding issue, where standards with broader character sets are translated to standards with fewer recognized characters (down-sized). 48 “Homographic” Spoofing Continued The loss of resolution from the larger format to the smaller made it possible for attacks such as domain name collisions. i.e. An original domain, foobar.biz, is registered and meets with commercial success. An attacker wishing to ‘phish’ the site (trick its users, through Social Engineering, into visiting his own site rather than the original) might register a domain name such as foόbar.biz, with a Greek accented “o”, which would resolve differently thanks to support for the larger standard among DNS servers. However, browsers displaying those characters in a reduced character set inadvertently make the two appear identically, despite resolving differently. Victims of these phishing attacks might not notice the redirection attempt due to a seemingly consistent domain name link. 49 Resolve canonicalization issue Make sure that the data input is canonicalized by the very first component that accepts the input. Make canonical representation before a name is being validated. Avoid making a decision based on file names. When required (and possible), have an operating system make the decision. Define strictly what the right name is, and use a regular expression to restrict what is allowed in a name. Always validate the entire filename “from the beginning to the end of the request”. For Windows, avoid auto- generated an 8.3 format filenames in order to resolve the FAT compatibility issue. Stop 8.3 filename generation (must be done at the Windows administrative level). Resolve canonicalization issue (cont) Do not use the PATH environment variable to find files. Always use the Full Path Names to your data and executable files. Canonicalize the name in order to get as close as possible to the file system’s representation of the file. Verify weather the filename refers to the file, not to a device or a link. Define the canonical version of server names and validate the names against it. Use canonical user’s name to validate a user account in the domain. For Windows, it must be the SAM name. Avoid double decoding of data Lab task Work on the spreadsheet re Vulnerability Cells H3, K3 – put your comments More venues leading to SQL injection Cells A9 and A25 – put your comments How to prevent command and SQL injection Email the spreadsheet Links http://www.owasp.org/index.php/Process_C ontrol http://www.owasp.org/index.php/Validation _performed_in_client http://www.owasp.org/index.php/Improper_ Data_Validation