G.H. RAISONI COLLEGE OF ENGINEERING & MANAGEMENT, PUNE
Department of Computer Engineering

Phishing Website Detector Using ML

Guide: Prof. Nivedita Kadam
Co-Guide: Prof. Gayatree Bedre

Project Members
BCOB81 - Safin Tamboli
BCOB85 - Sarthak Pawar
BCOB82 - Sairaj Jadhav
BCOB90 - Shoeb Shah

Contents
Introduction
Justification for Selecting the Title
Problem Statement
Examples of Phishing Websites
Literature Survey
Block Diagram
Future Scope
Expected Result
Work Plan
References

Introduction
Phishing has become a major concern for security researchers because it is easy to create a fake website that closely resembles a legitimate one. Experts can identify fake websites, but many users cannot, and those users become victims of phishing attacks. The attacker's main aim is to steal bank account credentials. Phishing attacks succeed largely because of a lack of user awareness. Because phishing exploits weaknesses found in users, it is very difficult to mitigate, so it is very important to enhance phishing detection techniques.

Justification for Selecting the Title
The main purpose of the project is to detect websites created by phishers to steal data from users, and to alert the user to the threat once it is detected. The project is intended to help users browse safely and to keep their data out of the hands of any phisher trying to use their credentials illegally. The title, "Phishing Website Detector Using ML", therefore states the idea and goal of the project directly.

Problem Statement
Machine learning comprises many algorithms that require past data to make a decision or prediction about future data. Using this technique, the algorithm will analyze blacklisted and legitimate URLs and their features to accurately detect phishing websites, including zero-hour phishing websites.
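As a rough illustration of what "URLs and their features" could mean in practice, the sketch below turns a URL into a few lexical features using only the Python standard library. The feature set and the function names (extract_features, host_is_ip) are assumptions made for this example, not the project's final feature list.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_ip(host: str) -> bool:
    """Return True when the host part is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_features(url: str) -> dict:
    """Map a URL to a small dictionary of illustrative lexical features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),              # long URLs can hide the real host
        "num_dots_in_host": host.count("."),  # many dots suggest nested subdomains
        "has_at_symbol": "@" in url,          # text before '@' is not the real host
        "has_hyphen_in_host": "-" in host,    # hyphens are common in look-alike names
        "host_is_ip": host_is_ip(host),       # raw IP used instead of a domain name
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    # Hypothetical URLs, used only to show the shape of the output.
    for sample in ("https://www.example.com/",
                   "http://paypal.com@evil-site.example/login"):
        print(sample, extract_features(sample))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to whichever classifier the project selects.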
The project sets out to answer the following questions:
1. How do URL detectors identify phishing URLs or websites?
2. How can ML methods be applied to classify malicious and legitimate websites?
3. How can the performance of a URL detector be evaluated?

Examples of Phishing Websites
1. Phishing website sent via mail
2. Phishing website sent via SMS

Literature Survey

1. Title: Detecting phishing websites using machine learning technique
Author: Ashit Kumar Dutta
Methodology: The proposed framework employs RNN-LSTM to identify the properties Pm and Pl in order to declare a URL as malicious or legitimate.
Advantages: The outcome of the study shows that the proposed method gives superior results compared with existing deep learning methods.
Future Scope: The future direction of the study is to develop an unsupervised deep learning method to generate insight from a URL.

2. Title: Phishing Website Detection using Machine Learning Algorithms
Author: Rishikesh Mahajan
Methodology: URLs of benign websites were collected from www.alexa.com, and URLs of phishing websites were collected from www.phishtank.com. Classification uses a Decision Tree as the splitter.
Advantages: A Python program was implemented to extract features from each URL for the detection of phishing URLs.
Future Scope: In future, hybrid technology will be implemented to detect phishing websites more accurately, using the random forest algorithm together with the blacklist method.

3. Title: Phishing Website Detection Using Machine Learning Classifiers Optimized by Feature Selection
Authors: Dželila Mehanović, Jasmin Kevrić
Methodology: The authors used the Weka tool and its algorithms for feature selection.
Advantages: To find the most valuable features, multiple feature selection filters were used. The outputs of these filters are analyzed to identify the features proposed as most important.
Future Scope: This is important because a decrease in the number of features is expected to decrease the time needed to build a model. K-Nearest Neighbor (KNN) was applied to perform phishing website detection.

4. Title: An Approach for Detecting Phishing Attacks Using Machine Learning Techniques
Author: K. Venkateshwara Rao
Methodology: The approach consists of parallel decision trees, each of which takes the input and produces a specific class. Thus, n trees produce different classes.
Advantages: The support vector machine gives an accuracy of 91.3% on the test data set.
Future Scope: In the future, a better way to find a phishing website may be found by using advanced features of the URL.

Block Diagram
Block diagram of the various stages of the project: Phishing Website Detector Using ML Algorithms.
Blocks shown: PhishTank, Malicious URLs, Legitimate URLs, Crawler Data, Emails/SMS/Enterprises, Feature Extraction, RNN, Training Phase, RNN & Random Forest, Testing Phase, Evaluating the Result.

Future Scope
1. Creating a safe, user-friendly environment that can detect illegitimate activities.
2. It is possible to report and block a hacker by using the phishing website's URL and to trace the location of such anonymous hackers.
3. Awareness can be created among users by displaying the types of phishing URLs in circulation, especially those that can cause more harm to the system, such as zero-hour phishing websites.

Expected Result
System Description: Detecting websites/URLs
Input: URLs, random websites, transaction IDs, suspicious mails
Output: Safe for Browsing (Continue) / Unsafe for Browsing (Block Website)
Possible Success Conditions: Developing a cautious way of browsing the internet; checking random URLs forwarded by mail or on social media.
Failure Conditions: A new format of phishing website may go undetected.
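The failure condition above, a phishing website going undetected, is a false negative, so accuracy alone does not describe a detector well. The sketch below is a minimal, hypothetical way to score predictions with accuracy, precision, recall, and F1, and to map a predicted label to the Safe/Unsafe output described in this section. The label encoding, the function names, and the toy labels are assumptions for illustration, not results from the project.

```python
from typing import Sequence

# Assumed label encoding for this sketch.
PHISHING, LEGITIMATE = 1, 0


def verdict(label: int) -> str:
    """Map a predicted label to the output wording of the Expected Result section."""
    if label == PHISHING:
        return "Unsafe for Browsing (Block Website)"
    return "Safe for Browsing (Continue)"


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Compute confusion-matrix counts and the usual summary metrics."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == PHISHING and p == PHISHING)
    tn = sum(1 for t, p in pairs if t == LEGITIMATE and p == LEGITIMATE)
    fp = sum(1 for t, p in pairs if t == LEGITIMATE and p == PHISHING)
    fn = sum(1 for t, p in pairs if t == PHISHING and p == LEGITIMATE)  # missed phishing
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels invented for the example: 4 phishing URLs, 6 legitimate ones.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(evaluate(y_true, y_pred))
    print(verdict(y_pred[0]))
```

Recall is the figure to watch for the failure condition: each missed phishing URL lowers it, even when overall accuracy stays high.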
Work Plan
Months: AUG'22, SEP'22, OCT'22, NOV'22, DEC'22, JAN'23, FEB'23
Activities:
Literature Reviews
Component Identification & Selection
Designing
Experimental Analysis
Fabrication
Testing and Debugging
Preparation of Project Report

References
1. Anti-Phishing Working Group (APWG), https://docs.apwg.org//reports/apwg_trends_report_q4_2019.pdf
2. Jain A.K., Gupta B.B., "PHISH-SAFE: URL Features-Based Phishing Detection System Using Machine Learning", Cyber Security, Advances in Intelligent Systems and Computing, vol. 729, 2018, https://doi.org/10.1007/978-981-10-8536-9_44
3. Purbay M., Kumar D., "Split Behavior of Supervised Machine Learning Algorithms for Phishing URL Detection", Lecture Notes in Electrical Engineering, vol. 683, 2021, https://doi.org/10.1007/978-981-15-6840-4_40
4. Gandotra E., Gupta D., "An Efficient Approach for Phishing Detection using Machine Learning", Algorithms for Intelligent Systems, Springer, Singapore, 2021, https://doi.org/10.1007/978-981-15-8711-5_12
5. Hung Le, Quang Pham, Doyen Sahoo, and Steven C.H. Hoi, "URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection", Conference'17, Washington, DC, USA, arXiv:1802.03162, July 2017.
6. Hong J., Kim T., Liu J., Park N., Kim S.W., "Phishing URL Detection with Lexical Features and Blacklisted Domains", Autonomous Secure Cyber Systems, Springer, https://doi.org/10.1007/978-3-030-33432-1_12
7. J. Kumar, A. Santhanavijayan, B. Janet, B. Rajendran and B. S. Bindhumadhava, "Phishing Website Classification and Detection Using Machine Learning", 2020 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 2020, pp. 1–6, doi: 10.1109/ICCCI48352.2020.9104161

Thank you!