International Journal of Advancements in Research & Technology, Volume 2, Issue 4, April-2013, 438
ISSN 2278-7763

Container and Virtualization Concept for Bi-filter Intrusion Detection with Caching of Web Requests in Relational Database

I. Jasmine Selvakumari Jeya (1), Harsha Thomas (2)
(1) Research Scholar, Dept. of CSE, Hindusthan College of Engineering and Technology, Coimbatore-32, Tamil Nadu, India, wjasminejeya@gmail.com
(2) P.G Student, Dept. of CSE, Hindusthan College of Engineering and Technology, Coimbatore-32, Tamil Nadu, India, thomasharsha26@gmail.com

ABSTRACT

In multi-tier web architecture, often referred to as n-tier architecture, the back-end database servers are kept protected behind a firewall, while web applications let users access a set of services from web servers that are remotely accessible over the Internet. The current IDSs installed at the web server and at the database server are unable to detect intrusions in which normal traffic is used to attack the back-end database. Though they are protected from direct remote attacks, the back-end systems are susceptible to attacks that use web requests as a means to exploit the back end. Existing prevention systems are often insufficient to protect this class of applications, because the security mechanisms provided are either not well understood or simply disabled by web developers to get the job done. Therefore, prevention mechanisms should be complemented by intrusion detection systems, which are able to identify attacks and provide early warning about suspicious activities. The proposed Bifilter approach is based on a mapping model that maps each web request to the set of resultant queries invoked by that request within an individual session. This mapping model can be used to detect abnormal behaviors. In this paper we also propose a new caching paradigm called reference point caching, whereby information about a document is cached at the point where the document is referenced.
Our motivation is to reduce latency by avoiding unnecessary protocol steps. We propose two specific instances of this scheme: caching IP addresses and caching documents themselves.

1 INTRODUCTION

Web based attacks have recently become more diverse, as attention has shifted from attacking the front end to exploiting vulnerabilities of web applications in order to corrupt the back-end database system. Intrusion detection plays one of the key roles in computer system security techniques. An intrusion detection system (IDS) is a device or software application that monitors network or system activities for malicious activities or policy violations and produces alerts.

An IDS differs from a firewall in that a firewall looks outwardly for intrusions in order to stop them from happening. Firewalls limit access between networks to prevent intrusion and do not signal an attack from inside the network. An IDS evaluates a suspected intrusion once it has taken place and signals an alarm. An IDS also watches for attacks that originate from within a system. This is traditionally achieved by examining network communications, identifying heuristics and patterns of common computer attacks, and taking action to alert operators. A system that terminates connections is called an intrusion prevention system, and is another form of application layer firewall.

There are two general approaches to intrusion detection: anomaly detection and misuse detection [9]. A signature based IDS [6] works similarly to anti-virus software. It employs a signature database of well-known attacks, and a successful match with current input raises an alert [4]. An anomaly based intrusion detection system is a system for detecting computer intrusions and misuse by monitoring system activity and classifying it as either normal or anomalous [2]. A statistical anomaly-based IDS determines normal network activity, such as what sort of bandwidth is generally used, what protocols are used, and what ports and devices generally connect to each other, and alerts the administrator or user when anomalous traffic is detected.

Bifilter intrusion detection is achieved by employing virtualization. It assigns each user's web session to a dedicated container, which is an isolated virtual computing environment. Each container has a unique container ID, which can be used to accurately associate a web request with the subsequent DB queries. Thus, Bifilter can build a causal mapping profile by taking both the web server and DB traffic into account.

An alternative to full virtualization is lightweight virtualization, generally based on some sort of container concept. With containers, a group of processes still appears to have its own dedicated system, but it is really running in a specially isolated environment. All containers run on top of the same kernel. With containers, the ability to run different operating systems is lost, as is the strong separation between virtual systems.

To reduce web access latencies, a new paradigm caches information at the reference point of a document [12]. If a document X is referred to from a document Y, information is cached at Y to reduce the latency of client accesses to X.

Table 1. Comparison of Different Anomaly Detection Techniques

S.No | Signature Based Detection | Misuse Detection | Anomaly Detection
1 | Catches the intrusion based on the signature pattern of a known attack | Catches the intrusion in terms of the characteristics of a known attack, i.e. knowledge based | Detects any action that significantly deviates from the normal behavior
2 | Manual, i.e. integrates human knowledge | Manual, i.e. integrates human knowledge | Automatic, i.e. self learning
3 | High accuracy in detecting known attacks | High accuracy in detecting unknown attacks | High accuracy in detecting unknown attacks
4 | Computationally less expensive | Computationally expensive | Computationally expensive
5 | Not able to detect zero day attacks | Able to detect zero day attacks | Able to detect zero day attacks
6 | Low FPR | Low FPR | High false alarms
7 | Does not require training | Does not require training | Requires initial training
8 | White box approach | White box approach | Black box approach
9 | Classified alerts | Classified alerts | Unclassified alerts

2 RELATED WORK

A network intrusion detection system can be classified into two types: signature detection and anomaly detection [3]. Anomaly detection first requires the IDS to define and characterize the correct and acceptable static form and dynamic behavior of the system, which is then used to detect abnormal behavior of the system. Thus, [5] first defines the normal behavior of the system and creates a profile of the user. Early IDS deployments used independent IDSs.

F. Valeur, G. Vigna, C. Kruegel, and R.A. Kemmerer [4] consider intrusion alert correlation, which transforms intrusion detection sensor alerts into succinct intrusion reports in order to reduce the number of replicated alerts, false positives, and non-relevant positives.

Meixing Le, Angelos Stavrou, and Brent ByungHoon Kang [1] proposed a new approach called DoubleGuard to detect intrusions in multitier web applications. This approach assumes that there is a causal mapping between web requests and the resulting SQL queries in a given session. With such a model, an attack can be readily detected if the database IDS can determine that a privileged request from the web server is not associated with user-privileged access. This approach does not require input validation or source code validation, and it does not need to know the application logic. It identifies the causal relationship between web server requests and database requests. The approach dynamically generates new containers and recycles the used ones.

Angelos Stavrou [8] stated that interactions among VEs are modeled as transactions. It is a requirement that the underlying virtualization technology prohibits processes running in different VEs from sharing memory, sending signals, or communicating with IPC facilities. Under this requirement, VEs interact with each other or with remote hosts using the same mechanisms as machines in a distributed computing system: through data sharing or socket connections. The work models such VE interactions as database transactions.

Giovanni Vigna and William Robertson [10] describe WebSTAT, a STAT-based intrusion detection system that supports the modeling and detection of sophisticated attacks. WebSTAT operates on multiple event streams, and it is able to correlate both network-level and operating system-level events with entries contained in server logs. D. Wagner [11] derives the specification of expected system calls by statically analyzing the source code.

Marco Cova, Davide Balzarotti, Viktoria Felmetsger, and Giovanni Vigna [7] proposed a novel approach based upon a detailed characterization of the internal state of a web application, by means of a number of anomaly models. Web application internal state is defined as the information that survives a single client and server interaction, or simply the information associated with a single user session. The minimum state information is passed as a cookie to the browser. Minimum context information, such as a session ID, must be passed between the browser and the server to identify the rest of the state information. The key point here is that it is easy to model a typical intrusion scenario by keeping track of all the states in which that intrusion is normally executed.

Table 1 summarizes the different anomaly detection techniques and compares their merits and demerits.
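The causal mapping between web requests and SQL queries that DoubleGuard [1] assumes within a session can be illustrated with a minimal sketch. It is not the authors' implementation: the names `MappingProfile` and `normalize`, the example request strings, and the idea of reducing each query to a structural skeleton by replacing literals with `?` are all assumptions made for illustration. A request of `None` stands for queries that arrive without any web request.

```python
import re
from collections import defaultdict


def normalize(query):
    """Reduce a SQL query to a rough structural skeleton (hypothetical helper)."""
    skeleton = re.sub(r"'[^']*'", "?", query)     # string literals -> ?
    skeleton = re.sub(r"\b\d+\b", "?", skeleton)  # numeric literals -> ?
    return " ".join(skeleton.lower().split())     # fold case and whitespace


class MappingProfile:
    """Maps each web request to the query sets seen for it during training."""

    def __init__(self):
        # request -> set of frozensets of query skeletons
        self.allowed = defaultdict(set)

    def train(self, session):
        """Record (request, queries) pairs from a session assumed legitimate."""
        for request, queries in session:
            self.allowed[request].add(frozenset(normalize(q) for q in queries))

    def check(self, session):
        """Return the requests whose observed query set was never seen in training."""
        violations = []
        for request, queries in session:
            observed = frozenset(normalize(q) for q in queries)
            if observed not in self.allowed.get(request, set()):
                violations.append(request)
        return violations


profile = MappingProfile()
profile.train([
    ("GET /user", ["SELECT * FROM users WHERE id = 7"]),
    ("GET /about", []),  # a request that triggers no queries
])

# Same structure with a different literal: consistent with the profile.
ok = profile.check([("GET /user", ["SELECT * FROM users WHERE id = 42"])])

# Injected condition changes the query structure: flagged.
injected = profile.check([("GET /user", ["SELECT * FROM users WHERE id = 42 OR 1=1"])])

# Queries with no web request at all were never trained: flagged.
direct = profile.check([(None, ["SELECT * FROM admin"])])
```

In this sketch a session is reported as suspicious as soon as `check` returns a non-empty list. A real deployment would need a far more careful query normalizer and a training set that covers legitimate behavior well, since anything unseen is treated as a violation.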
3 METHODOLOGY

The first step is to set up a threat model that includes the assumptions and the types of attacks that the system is trying to protect against. Figure 1 illustrates the classic three-tier model. At the database side, it is impossible to tell which transaction corresponds to which client request. The communication between the web server and the database server is not separated by session, so the relationships among the transactions can hardly be understood. According to Figure 1, if Client 2 is malicious and takes over the web server, all subsequent database transactions become suspect, as well as the responses to the clients.

Bifilter is able to ferret out attacks that even independent IDSs would not be able to identify. This approach can create normality models of isolated user sessions that include both the web front-end and back-end network transactions. It uses a container-based and session-separated web server architecture that not only enhances security but also provides isolated information flows that are separated in each container session. This allows us to identify the mapping between the web server requests and the subsequent DB queries, and to utilize such a mapping model to detect abnormal behaviors on a session/client level.

Figure 1. Three Tier Architecture (Clients 1 to 3 exchange requests Rq1 to Rq3 and responses Rs1 to Rs3 with the web server, which exchanges database queries and database replies with the database server.)

This architecture puts filters at both sides of the servers. At the web server, our filters are deployed on the host system and cannot be attacked directly, since only the virtualized containers are exposed to attackers. The filters will not be attacked at the database server either, as we assume that the attacker cannot completely take control of the database.

Bifilter flags sessions that deviate from the model, so its detection may include false positives. The number of false positives depends on the size and coverage of the training sessions. Finally, the Bifilter application reduced the false positives for both static and dynamic pages.

Suppose that a user's browser needs an image for a web page. Assume that the browser caches, that all its requests are funneled through a transparent proxy cache, and that the website has a reverse proxy cache sitting in front of it. The browser checks whether the image is cached locally. If so, and the image is not stale, the browser uses the image from its cache. Otherwise, the browser sends the request for the image to the website. Since there is a transparent proxy cache, the request is intercepted by the proxy cache. The transparent proxy cache checks whether it has the image. If so, and the image is not stale, the proxy cache sends the image to the browser, which uses it and also caches it. Otherwise, the proxy cache sends the request for the image to the website, where it is intercepted by the reverse proxy cache.

Figure 2. Bifilter Intrusion Detection System with Caching (Web clients exchange HTTP requests and responses through the firewall with the web server, e.g. Apache, IIS, Netscape; the web server uses plugins such as Perl and C/C++, the VE, caching, and the web application, and issues SQL to the database over a database connection such as ADO or ODBC.)

The reverse proxy cache checks whether it has the image. If so, and the object is not stale, the reverse proxy cache sends the image to the requesting transparent proxy cache. Otherwise, the reverse proxy cache gets the image from the website, sends it to the requesting proxy cache, and caches the image. When the transparent proxy cache gets the image, it sends it to the browser and also caches it. Note that in each case, if the cache size is exceeded, the cache will have to throw out one or more cached objects so as to cache a new object. Typically the objects discarded are the ones that are used infrequently or that have not been used for a long time.

3.1 Applying Virtualization Concept

The OpenVZ network virtualization layer is designed to isolate containers (CTs) from each other and from the physical network. Each container has its own IP address; multiple IP addresses per CT are allowed. Network traffic of a CT is isolated from the other CTs. In other words, containers are protected from each other in a way that makes traffic snooping impossible. Firewalling may be used inside a CT: the user can create rules limiting access to some services using the canonical iptables tool inside the CT. Routing table manipulations and advanced routing features are supported for individual containers.

3.2 Create Container Model

This model makes use of lightweight process containers, referred to as containers, as ephemeral, disposable servers for client sessions. It is possible to initialize thousands of containers on a single physical machine, and these virtualized containers can be discarded, reverted, or quickly reinitialized to serve new sessions. A single physical web server runs many containers, each one an exact copy of the original web server. This approach dynamically generates new containers and recycles used ones. As a result, a single physical server can run continuously and serve all web requests. This container-based and session-separated web server architecture not only enhances security but also provides isolated information flows that are separated in each container session.

It allows us to identify the mapping between the web server requests and the subsequent DB queries, and to utilize such a mapping model to detect abnormal behaviors on a session/client level. We want to model such causal mapping relationships for all legitimate traffic. Figure 2 depicts how communications are categorized as sessions and how database transactions can be related to a corresponding session. In Figure 2, Client 2 will only compromise VE 2, and the corresponding database transaction set T2 will be the only affected section of data within the database.

It is impossible for a database server to determine which SQL queries are the results of which web requests, much less to find out the relationship between them. However, within our container-based web servers, it is a straightforward matter to identify the causal pairs of web requests and resulting SQL queries in a given session. Moreover, as traffic can easily be separated by session, it becomes possible to compare and analyze the requests and queries across different sessions. Thus the mapping model can be used to detect abnormal behaviors. Both the web requests and the database queries within each session should be in accordance with the model. If any request or query violates the normality model within a session, then the session is treated as a possible attack.

3.3 Mapping Relations

Bifilter classifies four possible mapping patterns. Since the request is at the origin of the data flow, each request is treated as the mapping source. In other words, the mappings in the model are always of the form of one request to a query set, rm → {qn, qp}, where qn and qp denote different database queries. The mapping relation describes how a request and its corresponding queries are matched, that is, the causal relationship between them. The possible mapping patterns are as follows.

Deterministic Mapping: This is the most common and perfectly matched pattern. That is to say, web request rm appears in all traffic with the SQL query set Qn. The mapping pattern is then rm → Qn. In static websites, this type of mapping comprises the majority of cases, since the same results should be returned each time a user visits the same link.

Empty Query Set: In special cases, the SQL query set may be the empty set. This implies that the web request neither causes nor generates any database queries.

No Matched Request: In some cases, the web server may periodically submit queries to the database server in order to conduct scheduled tasks, such as cron jobs for archiving or backup. Such queries have no matching web request.

Nondeterministic Mapping: The same web request may result in different SQL query sets based on input parameters or the status of the web page at the time the web request is received. In fact, these different SQL query sets do not appear randomly, and there exists a candidate pool of query sets (e.g., {qn, qp, ...}).

3.4 Web Based Attacks

The different types of web based attacks are as follows.

Path Traversal Attack: In a path traversal attack, an intruder manipulates a URL in such a way that the web server executes or reveals the contents of a file anywhere on the server, including files lying outside the document root directory. Path traversal attacks take advantage of special-character sequences in URL input parameters, cookies, and HTTP request headers. The most basic path traversal attack uses the "../" character sequence to alter the document or resource location requested in a URL. Although most web servers prevent this method from escaping the web document root, alternate encodings of the "../" sequence, such as Unicode encoding, can bypass basic security filters. This can be prevented by blocking requests that contain unsafe characters, and also by disabling the parent paths setting, which prevents the use of ".." in script and application calls.

Privilege Escalation Attack: Suppose that an attacker logs into the web server as a normal user, as in Figure 3, upgrades his/her privileges, and triggers admin queries so as to obtain an administrator's data. This attack can never be detected by either the web server IDS or the database IDS alone, since both the user request ru and the admin query set Qa are legitimate requests and queries.

Figure 3. Privilege Escalation Attack (steps: 1. User Request; 2. Privilege Escalation; 3. Admin Queries; 4. Database Replies; 5. Response.)

Hijack Future Session Attack: As shown in Figure 4, the attacker takes over the web server and hijacks other users' sessions by sending spoofed replies. In DoubleGuard this is detected through the causal mapping: a request without its expected queries is not accepted. Fortunately, the isolation property of our container based web server architecture can also prevent this type of attack.

Figure 4. Hijack Future Session Attack

Direct DB Attack: It is possible for an attacker to bypass the web server or firewalls and connect directly to the database. An attacker could also have already taken over the web server and be submitting such queries from the web server without sending web requests. Without matched web requests for such queries, a web server IDS could detect neither. Furthermore, if these DB queries were within the set of allowed queries, then the database IDS would not detect them either. However, this type of attack can be caught with our approach, since we cannot match any web requests with these queries, as in Figure 5.

Figure 5. Direct DB Attack

Injection Attack: Attackers can use existing vulnerabilities in the web server logic to inject data or string content that contains exploits, and then use the web server to relay these exploits to attack the back-end database. Since our approach provides two-tier detection, even if the exploits are accepted by the web server, the contents relayed to the DB server would not be able to take on the expected structure for the given web server request. For instance, since an SQL injection attack changes the structure of the SQL queries, even if the injected data were to go through the web server side, it would generate SQL queries with a different structure, which could be detected as a deviation from the SQL query structure that would normally follow such a web request. The injection attack is shown in Figure 6, and Figure 7 gives an example.

Figure 6. Injection Attack

Figure 7. Injection Attack Example

Cross-site Scripting (XSS): This enables attackers to inject client-side script into web pages viewed by other users. The primary defense mechanism against XSS is contextual output encoding/escaping. There are several different escaping schemes that must be used depending on where the untrusted string needs to be placed within an HTML document, including HTML entity encoding, JavaScript escaping, CSS escaping, and URL (or percent) encoding. Most web applications that do not need to accept rich data can use escaping to largely eliminate the risk of XSS in a fairly straightforward manner.

All network traffic, from both legitimate users and adversaries, is received intermixed at the same web server. We first tried to categorize all of the potential single (atomic) operations on the web pages. All of the operations that appear within one session are permutations of these operations. If we could build a mapping model for each of these basic operations, then we could compare web requests to determine the basic operations of the session and obtain the most likely set of queries mapped from these operations. If these single operation models could not cover all of the requests and queries in a session, then this would indicate a possible intrusion.

Figure 8. The Detection of Privilege Escalation Attack

Figure 9. Hijack Future Session Attack

Figure 10. The Denial of Service Attack

Figure 8 indicates how the attacker increases his privileges in an unauthorized way and how the privilege escalation attack is detected. The hijack future session attack is depicted in Figure 9; all future sessions will be hijacked by the attacker. Figure 10 shows how service is denied to a requesting client.

4 REDUCING LATENCY USING REFERENCE POINT CACHING

Consider a client C browsing through a page at R (called the reference point) which has a link to a page on server S. If C decides to browse the page at S, the standard mechanism is for C to first initiate a DNS query for the host name S. If the DNS mapping is not available in the client DNS cache, the query is sent to the local DNS server in the client domain; if S is not cached in the local name server, the DNS query may be sent to the root server and then to the authoritative server for S in S's domain. Our measurements indicate that DNS query times can be very large, up to several seconds.

In this new mechanism, the reference point is allowed to have a cached copy of the page at S. If R has cached the page, it indicates this by annotating its link to S with a flag stating that the page is "locallyCached". This flag is used by the client browser but is not displayed to the user. If the client browser decides to retrieve the cached page at R, it can do so using the same connection it already has to R. This not only avoids a connection setup delay but also makes it more likely that the congestion window of the TCP connection is large enough to sustain higher throughput.

4.1 Caching IP Address

When a client contacts a server for the first time, it has to look up the IP address of the server, since URLs provide only server names. This lookup can take hundreds of milliseconds if the address is not already cached locally. In reference point caching of IP addresses, a server, such as a search engine, includes the IP addresses of all the hosts referenced in the page it supplies. The server can preprocess static pages to include the IP addresses in the page, and while generating dynamic pages, it can include the IP addresses by looking them up in its local DNS. To avoid latency in generating these pages, the server should not include an address for a link if the address is not currently in its local DNS cache. To reduce the DNS traffic originating from search engines and to reduce latency, we recommend that the search engine run a modified name server which prefetches the IP addresses of frequently queried host names before they expire.

4.2 Caching Documents

In reference point caching of documents, if a document X at server R refers to a document Y at server S, Y can be cached at R, thus allowing a client that accesses document X from R to retrieve document Y from the server R itself without making an additional connection to S.

5 CONCLUSION

In this paper we surveyed a few techniques meant for intrusion detection in multitier web applications. Some of the techniques use a single IDS to detect malicious requests and protect the web server, while others use a combined approach to detect intrusions at both the web and database levels. Apart from the approaches discussed above, the last approach has an additional detection capability to detect attacks where
normal traffic is used as a means to launch database attacks. Because of its container based and session separated approach, Bifilter uses multiple input streams to produce alerts. Such correlation of different data streams provides a better characterization of the system for anomaly detection because the intrusion sensor has a more ... The motivation of reference point caching is to reduce latency by avoiding unnecessary protocol steps.

REFERENCES

[1] A. Stavrou, M. Le, and B.B. Kang, George Mason University, "DoubleGuard: Detecting Intrusions in Multi-Tier Web Applications," July/August 2012.
[2] A. Seleznyov and S. Puuronen, "Anomaly Intrusion Detection Systems: Handling Temporal Relations between Events," Proc. Int'l Symp. Recent Advances in Intrusion Detection (RAID '99), 1999.
[3] C. Kruegel and G. Vigna, "Anomaly Detection of Web-Based Attacks," Proc. 10th ACM Conf. Computer and Comm. Security (CCS '03), Oct. 2003.
[4] F. Valeur, G. Vigna, C. Kruegel, and R.A. Kemmerer, "A Comprehensive Approach to Intrusion Detection Alert Correlation," IEEE Trans. Dependable and Secure Computing, vol. 1, no. 3, pp. 146-169, July-Sept. 2004.
[5] G. Vigna, F. Valeur, D. Balzarotti, W.K. Robertson, C. Kruegel, and E. Kirda, "Reducing Errors in the Anomaly-Based Detection of Web-Based Attacks through the Combined Analysis of Web Requests and SQL Queries," J. Computer Security, vol. 17, no. 3, pp. 305-329, 2009.
[6] H. Debar, M. Dacier, and A. Wespi, "Towards a Taxonomy of Intrusion-Detection Systems," Computer Networks, vol. 31, no. 9, pp. 805-822, 1999.
[7] M. Cova, D. Balzarotti, V. Felmetsger, and G. Vigna, "Swaddler: An Approach for the Anomaly-Based Detection of State Violations in Web Applications," RAID 2007.
[8] Y. Huang, A. Stavrou, A.K. Ghosh, and S. Jajodia, "Efficiently Tracking Application Interactions Using Lightweight Virtualization," Proc. 1st ACM Workshop on Virtual Machine Security, 2008.
[9] T. Verwoerd and R. Hunt, "Intrusion Detection Techniques and Approaches," Computer Communications, 25(15), 2002.
[10] G. Vigna, W.K. Robertson, V. Kher, and R.A. Kemmerer, "A Stateful Intrusion Detection System for World-Wide Web Servers," ACSAC 2003, IEEE Computer Society.
[11] D. Wagner and D. Dean, "Intrusion Detection via Static Analysis," Symposium on Security and Privacy (SSP '01), May 2001.
[12] G.P. Chandranmenon and G. Varghese, "Reducing Web Latency Using Reference Point Caching," Bell Laboratories; University of California, San Diego.

AUTHOR'S PROFILE

I. JASMINE SELVAKUMARI JEYA is a Research Scholar and Assistant Professor in the Department of Computer Science and Engineering at Hindusthan College of Engineering and Technology, Coimbatore. Her research work focuses on security issues in various databases using optimization techniques.

HARSHA THOMAS is a PG student doing M.E in the Department of Computer Science and Engineering at Hindusthan College of Engineering and Technology, Coimbatore. Her area of interest and project focus on security issues in relational databases.

Copyright © 2013 SciResPub.