International Conference on Global Trends in Engineering, Technology and Management (ICGTETM-2016)

Web Vulnerability Scanner Using HTTP Method

Pramod B. Gosavi#1, Harshala P. Patil*2
#1 Associate Professor & HOD, Information Technology Department, North Maharashtra University, Godavari College of Engineering, Jalgaon, India
#2 Student, Computer Science and Engg., North Maharashtra University, Godavari College of Engineering, Jalgaon, India

Abstract: As the popularity of the World Wide Web increases, website security becomes one of the most important concerns for an organization and should be given high priority. Many web application tools have been developed for website security. These tools scan for the vulnerabilities present in a site, but their results are still imperfect. This paper presents a powerful web vulnerability scanner based on the HTTP method and compares its results with those of other scanners available in the market. The proposed system combines a robust link crawler with improved detection accuracy. It is not only able to exercise more of the web application's code, but also discovers vulnerabilities that other vulnerability scanners are not able to detect. During website development, developers often forget to remove sensitive data that is not supposed to be exposed to public users. Such data includes untested vulnerable forms, database backups, and site backups in compressed format. An attacker looks for this kind of data and collects important information from it, such as login details.

Keywords: Web Application Security, Automated URL Crawling, Automated Vulnerability Detection, Threat Detection, Automated Web Security Audit, Content Management System.

I. INTRODUCTION
Web applications are the most popular way to deliver services via the Internet.
A modern web application consists of a server-side part, mostly written in Java, PHP, Ruby, or Python and running on the server, and a client part running in the user’s web browser, implemented in JavaScript and using HTML/CSS. The presence of bugs and vulnerabilities in software applications can be checked by two main approaches [1,2]:
• In white-box testing, the source code of the application is examined to detect vulnerable lines of code.
• In black-box testing, the source code is not checked directly. Instead, special input test cases are generated and sent to the application. Then the results returned by the application are examined for vulnerabilities.
We scanned the website by black-box testing in an automated fashion. Automated black-box web vulnerability scanners are the most popular way of finding vulnerabilities in web applications. If security issues are not considered while a website is developed, the sensitive information held in its database applications becomes more vulnerable. Attackers use such vulnerable websites for criminal activities, either by injecting malicious code or by using them to transfer illegal content. We scanned the website using the scanning options given below:
a. SQL Injection Test
b. Remote File Inclusion
c. Malware Detection
d. Local File Inclusion
e. Cross Site Scripting

ISSN: 2231-5381

II. RELATED WORK
To be effective, the developers of black-box web application security scanners need a sophisticated understanding of the application [3]. A great deal of industry effort has been dedicated to web vulnerability scanners. Manual vulnerability detection is complex because it requires a high level of skill and the ability to keep track of the code used in a web application.
Hackers are regularly searching for new ways to exploit web applications, as web vulnerability statistics show [4]. Earlier results showed that selecting a vulnerability scanner for web vulnerability detection is a very difficult task, first of all because different scanners detect different types of vulnerabilities [5]. Feinstein analysed JavaScript obfuscation cases and implemented a tool for obfuscation detection [6].

http://www.ijettjournal.org

III. METHODOLOGY
The system works as follows. The web vulnerability scanner uses various techniques to fetch all possible URLs of a remote website. It fetches the URLs by the following methods:
1. Visiting pages linked on websites.
2. Getting links from robots.txt files.
3. Detecting folders with "Directory Indexing enabled" and collecting their file URLs.
4. Getting crawled links from search engines (Google, Bing etc.).
5. Directory guessing using the directory traversal method.
All of the above techniques use a recursive search to find URLs. The recursive search provides an option to control the number of simultaneous visits to the target site. If the visit count is not controlled, the target site may be overloaded, much as in a DoS attack. After collecting all the URLs, the scanner analyses them and checks whether each URL is eligible for SQL, XSS, LFI or RFI attacks, and categorizes the URLs accordingly. (Cross site scripting refers to a range of attacks in which the attacker injects malicious JavaScript into a web application; file inclusion vulnerabilities are also due to the use of user-supplied input without proper validation.) It then applies vulnerability injection methods (attacks) to the categorized URLs. When an attack draws a positive response, the URL is logged as vulnerable.
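The URL collection and categorization steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names LinkExtractor, links_from_html, links_from_robots, categorize and FILE_PARAM_HINTS are hypothetical, and the eligibility rule (any query parameter makes a URL a candidate for SQL injection and XSS tests; a file-like parameter name additionally makes it a candidate for LFI and RFI tests) is an assumed heuristic that the paper does not specify. The sketch works on page and robots.txt content that has already been downloaded, so it performs no network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs


class LinkExtractor(HTMLParser):
    """Collects link targets from anchor, iframe and form tags."""

    # Tag name -> attribute that carries the URL.
    URL_ATTRS = {"a": "href", "iframe": "src", "form": "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        attr_name = self.URL_ATTRS.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # Resolve relative links against the page URL.
                self.links.add(urljoin(self.base_url, value))


def links_from_html(base_url, html):
    """Returns the absolute URLs linked from one HTML page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def links_from_robots(base_url, robots_txt):
    """Turns literal Allow/Disallow paths of robots.txt into candidate URLs."""
    links = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        path = value.split("#")[0].strip()  # drop trailing comments
        is_rule = key.strip().lower() in ("allow", "disallow")
        # Wildcard patterns are skipped; only literal paths become URLs.
        if is_rule and path.startswith("/") and "*" not in path:
            links.add(urljoin(base_url, path))
    return links


# Assumed parameter names that hint at file inclusion (not from the paper).
FILE_PARAM_HINTS = {"file", "page", "path", "include", "template"}


def categorize(url):
    """Returns the attack categories for which a URL is a test candidate."""
    params = parse_qs(urlparse(url).query, keep_blank_values=True)
    categories = set()
    if params:
        categories.update({"sqli", "xss"})
    if any(name.lower() in FILE_PARAM_HINTS for name in params):
        categories.update({"lfi", "rfi"})
    return categories


if __name__ == "__main__":
    page = '<a href="/item.php?id=3">item</a><iframe src="view.php?page=home"></iframe>'
    for link in sorted(links_from_html("http://www.example.com/", page)):
        print(link, sorted(categorize(link)))
```

A crawler built on this sketch would apply links_from_html recursively to each newly found URL of the same host, while limiting the number of simultaneous requests as described above.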
While checking the responses after sending vulnerable payloads, if the scan engine continuously receives HTTP status 500 (Internal Server Error) because the target site is overloaded, it pauses visiting links and waits for some time so that the site can return to normal. If it still receives HTTP status 500, it stops scanning the site after a particular count.

Figure 1: Architecture of proposed system

Figure 1 explains the scanning methodology of the scan engine. It scans multiple sites simultaneously and logs the vulnerabilities detected independently. If the scan engine is overloaded, it puts the next site in a queue and waits for the scan engine to come back to normal load.
Figure 1 shows our system and methodology for automatically detecting the vulnerabilities present on web pages. The system includes the different types of scanning mentioned above. The system also has a unique feature called a web crawler, of the kind mainly used by search engines to gather data for indexing purposes. It is a program which automatically traverses the web by downloading documents and following links from page to page.
Let us discuss how the different types of scanning are done by the scanner.
SQL injection attacks are based on injecting strings into database queries that alter their planned use. This can occur if a web application does not properly sanitize (or filter) user input.
Remote File Inclusion (RFI) is a vulnerability attack that targets the computer servers that run web sites and their applications.
1. Malware Detection: Malware is malicious and unwanted software. The scanner helps to remove viruses, spyware, and other malware.
2. Local File Inclusion (LFI): It is the same as a Remote File Inclusion vulnerability except that, instead of remote files, only local files can be included. This type of vulnerability occurs if a web page is not properly sanitized.

Our proposed system has the following features:
1. Domain reputation: Checks whether the domain is listed in reputation databases. These databases and organizations store IP addresses and domains which are involved in malware, spamming and phishing activities.
2. Mail server IP check in 58 RBL repositories: An RBL (Real-time Blackhole List) or DNSBL (DNS-based Blackhole List) is a list of IP addresses whose owners refuse to stop the proliferation of spam. The RBL usually lists server IP addresses from ISPs whose customers are responsible for the spam and from ISPs whose servers are hijacked for spam relay.
3. Scan SQL injections for databases: It is a trick that exploits poorly filtered or incorrectly escaped SQL queries into parsing variable data from user input. Currently checking for.
4. Scan local file injections (LFI): It injects files on a server through the web browser. This vulnerability occurs when a page include is not properly filtered and allows directory traversal characters to be injected.
5. Scan remote file inclusion (RFI): It allows an attacker to include a remote file, usually through a script on the web server. The vulnerability is due to the use of user-supplied input without proper authentication. This can cause code execution on the web server, or code execution on the client side such as JavaScript, which can lead to other attacks such as cross site scripting (XSS), DoS, data theft etc.
6. Scan XSS (Cross Site Scripting): A type of computer security vulnerability typically found in web applications. XSS enables attackers to inject client-side script into web pages viewed by other users. The scanner detects forms on the web pages and scans GET and POST requests. Currently it scans for reflected XSS; stored XSS is planned for the future. Stored XSS occurs when a web application gathers input from a user which might be infected, and then stores that input in a data store for future use.
7. Scan malware:
• Website defacement check: Website defacement is a type of attack on a website that changes the visual appearance of the site or a webpage.
• Forceful redirect injection test.
• Scans JavaScript code snippets against generic signatures: checks for dangerous JavaScript functions like eval, base64_decode, char etc.
• Checks for iframes.
• Special algorithm to detect JavaScript obfuscation: obfuscation is used to convert vulnerable code into an unreadable format [7].
• Third party links check: checks third party links against reputation databases.
8. Detect and scan CMS: Detects WordPress and Joomla. Scans themes, plug-ins and unprotected admin areas. User enumeration. FPD (File Path Disclosure) scanning.
9. Scans open ports on the server.
10. Banner grabbing: Administrators can use this to take inventory of the systems and services on their network. An intruder can use banner grabbing to find network hosts that are running versions of applications and operating systems with known exploits.
11. Directory scanning: The goal of this scan is to detect files on the server that are not intended to be accessible. Such exposure is caused by a lack of security for directory access on the web server.
12. Detect sensitive URLs in the site: Scans for sensitive areas of the site which should not be open to all, e.g. admin login pages.
13. Scan password auto-complete enabled fields: Many websites have a login form where users provide a username and password. The default behaviour of browsers is to allow users to store these credentials locally in the browser, so that the next time a similar form appears, the username and password are already populated. This makes it easy to steal the stored passwords from the user’s browser.
14. Information disclosure: Checks for email addresses and IP addresses in the page.
15. Authenticated area scanning: Scans restricted areas like admin panels. It supports HTTP and web-form based authentication.
16. Robust link crawler: Crawls links from web pages, robots.txt, iframes, hackers’ favourite search engines, directory indexes and directory traversals.
17. SSL certificate checking: Scans an HTTPS service to enumerate which protocols and ciphers the service supports. It checks for weak ciphers and the validity period of the certificate.
18. Backdoor WebShell Locater (client side unique feature): Scans from the client side for shells in commonly injected locations and under their usual file names, e.g. http://www.example.com/uploads/cmd99.php
19. WebShell Finder: Scans each web page for particular keywords, so that it is able to detect a web shell even if it has been renamed, e.g. http://www.example.com/uploads/myname.php (myname.php is a web shell).
20. Reverse IP domain check: Finds all other domains hosted on the same server (the server on which the scanned domain is hosted) and checks these domains against blacklists.

IV. RESULTS
The following two tables show the feature comparison and the result accuracy comparison of the Web Vulnerability Scanner with currently existing products. The feature comparison shows the availability of features like SQL Injection and XSS in the respective products: Yes if the feature is available, otherwise No. The result accuracy comparison shows the percentage of detected vulnerabilities against the total known vulnerabilities in the sites. For example:
If site X has a total of 50 SQL injection vulnerable links, the proposed solution is able to detect 85% (43) of the vulnerable links. Some solutions, like SqlMap, lack certain features, which are marked NA in the comparison table.

TABLE I
Feature Comparison
Scanners: Web Vulnerability Scanner (WVS) using HTTP method, IBM AppScan, Acunetix, SqlMap, Burp Suite
Features: SQL Injection, XSS, Malware, Page Deface, Web Shell, CMS (Joomla / WordPress), Reverse IP Domains, Domain Reputation
Values: Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No No No No No No No No No No No No No No No No No No Yes Yes Yes

TABLE II
Result Accuracy
Scanners: Web Vulnerability Scanner (WVS) using HTTP method, IBM AppScan, Acunetix, SqlMap, Burp Suite
Values: 85% 100% 90% 98% 98% 95% 98% 100% 100% 100% 100% 100% 100% 100% 95% 60% 100% 70% NA NA 100% 95% 60% NA NA NA NA NA 95% NA 95% NA 100% 96% 30% 60% 70% NA NA 95%

[Chart: result accuracy per vulnerability type (SQL Injection, XSS, Malware, LFI, RFI, Page Deface, Web Shell, Link Crawling); vertical axis from 0% to 120%; series labels include Web Vulnerability Scanner using HTTP method (WVS) and IBM AppScan]

As per our results, IBM AppScan was the only existing scanner to detect SQL Injection, cross site scripting, malware, page defacement, web shell and reverse IP domain vulnerabilities. It detected 6 of these cases, but Content Management System vulnerabilities are detected only by our proposed scanner. The results of our proposed scanner were determined by taking false positives into account (i.e., situations where a scanner detects a vulnerability that in reality does not exist).

V. FUTURE SCOPE
The proposed system can be extended for higher accuracy, effectiveness and efficiency by implementing the following:
• View state scanning: Scanning the view states in .Net/aspx sites for secure encryption methods.
• Scan applications hosted on open ports: Check whether any vulnerability is available for applications installed on open ports.
• Server side file scanning: File integrity checking. An agent will be installed on the server and will scan the files in a specified folder at a set interval. It will alert if any change is found in a file by comparing it with a stored snapshot of the folder.
• Password type submission method: Check whether passwords are submitted to the web server in encrypted form.
• Special scanning for .Net and asp sites.
• Page content change monitoring: Similar to file integrity checking, it will compare web pages on the client side.

VI. CONCLUSION
This paper has presented an enhanced solution with a robust link crawler and more features than the available solutions. The proposed solution crawls all possible links in the website and applies different vulnerability scans, such as SQL, XSS, LFI, RFI and malware, to the crawled URLs. It minimizes false positives in the final result. The proposed solution also takes care to minimize the load on the target site during scanning.

ACKNOWLEDGMENT
I am thankful to Prof. Pramod B. Gosavi, HOD of the Information Technology department and my research guide, for having permitted me to carry out this work. I express my deep sense of gratitude for his able guidance and useful suggestions, which helped me in completing the work in time. I express my sincere gratitude to all the teaching and non-teaching staff members of the Computer Engineering Department for their valuable time, support and suggestions. Finally, yet importantly, I would like to express my heartfelt thanks to God and my parents for their blessings for the successful completion of this task.

REFERENCES
[1] Garrett, J. J. Ajax: A New Approach to Web Applications. http://www.adaptivepath.com/ideas/essays/archives/000385.php, Feb. 2005.
[2] Stefan Kals, Engin Kirda, Christopher Kruegel, and Nenad Jovanovic. "SecuBat: A Web Vulnerability Scanner".
[3] Doupe, A., Cova, M., & Vigna, G. (2010). An Analysis of Black-box Web Vulnerability Scanners. 7th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, 1-21. Retrieved June 31, 2014.
[4] Web Application Security Statistics. Web Application Security Consortium. [Online]. Available: http://projects.webappsec.org/Web-Application-Security-Statistics
[5] Marco Vieira, Nuno Antunes, Henrique Madeira. Using Web Security Scanners to Detect Vulnerabilities in Web Services.
[6] Feinstein, B., Peck, D. Caffeine Monkey: Automated Collection, Detection and
[7] YoungHan Choi, TaeGhyoon Kim, SeokJin Choi. Automatic Detection for JavaScript Obfuscation Attacks in Web Pages through String Pattern Analysis.
[8] OWASP Foundation. (2014c). A3 – Cross Site Scripting. Retrieved September 3, 2014, from https://www.owasp.org/index.php/Top_10_2013-A3-CrossSite_Scripting_(XSS)
[9] Carlo Ghezzi, Mehdi Jazayeri, and Dino Mandrioli. Fundamentals of Software Engineering. Prentice-Hall International, 1994.
[10] David Endler. The Evolution of Cross Site Scripting Attacks. Technical report, iDEFENSE Labs, 2002.