Master Project Proposal
PhishLurk:
A Mechanism for Classifying and Blocking
Phishing Websites
by: Mohammed Alqahtani
1. Committee Members and Signatures:
Approved by
Date
__________________________________
Advisor: Dr. Edward Chow
_____________
__________________________________
Committee member: Dr. Albert Glock
_____________
__________________________________
Committee member: Dr. Chuan Yue
_____________
Introduction
Phishing is a cybercrime in which a person or organization tries to steal highly sensitive
information such as usernames, passwords, and credit card details. Most phishing attacks take one
of two forms: emails or web pages that spoof a legitimate party and lure the user into entering
sensitive information. In other words, phishing directs users to fraudulent websites in order to
obtain their sensitive information. Users increasingly rely on the internet for daily tasks such as
bill payment, banking, and socializing. As a result, more and more personal information is used
online for different purposes, which expands the attack surface for phishing.
Figure: Sample of a phishing website (source: www.phishtank.com)
Phishing has been a major concern in IT security. In the U.S., companies lose more than
$2 billion every year as a result of phishing attacks [6]. Phishing works for many reasons; among
the most common are users' carelessness and their lack of knowledge about how to tell whether a
website is phishing or not [1]. Moreover, there are long lists of phishing websites that are hard
to detect.
Many research efforts have focused on anti-phishing, using different filtering and detection
methods such as blacklists and browser plug-ins, extensions, and toolbars [2]. Desktop browser
developers try hard to provide solid protection, for example by displaying a warning message box
when a website is potentially a phishing site or has an invalid or expired SSL certificate. In most
cases a third party and a blacklist are involved in identifying phishing websites and displaying
the warning [3].
Recently, users have gained more ways to access the internet, for example notebooks, gaming
devices, handhelds, and smartphones. However, the growing variety of devices, each with different
capabilities and features, complicates providing full protection, especially against phishing
attacks, and no complete protection exists yet. Smartphones are among the most widely used of
these devices. According to a survey by comScore, Inc., the number of smartphone subscribers
increased 60 percent in 2010 compared to 2009 [4]. Another report, by the Nielsen Company,
indicates that by 2011 half of cell phone users would be using smartphones [5].
Figure: Global rapid growth of the smartphone market, 2009 - 2010
Users have started to use these devices for their activities and tasks because of the
advantages they provide; smartphones, for instance, are preferred for their ease of use,
flexibility, and mobility. Some activities, such as online banking, paying bills, online shopping,
and emailing [5], require users to enter sensitive information to complete the authentication and
authorization process. This sensitive information can include credit card numbers, passwords, and
usernames. Having many types of devices to access the internet therefore expands the attack
surface for phishers and complicates protection.
Related Work
PhishTank is a non-profit project that aims to build a dependable database of phishing
websites [7]. The project collects, verifies, tracks, and shares phishing data. To report a
phishing link, a user has to be registered as a member, so that the administrators can learn about
and judge each member's contributions. Phishing websites can be reported and submitted via email
or through PhishTank's website. The data are verified by a committee after members submit them.
PhishTank's database can be shared via its API. The links in the original database are classified
only as “phishing” or “unknown”. We will classify the phishing sites in the PhishTank database
into more precise categories and use them in the proposed project. PhishTank has been working
effectively against phishing attacks: thousands of phishing links are detected and verified as
valid phishing sites every month [9], using the public's effort and contributions to build a
trustworthy and dependable database that is open for everyone to use and share. As a result,
several well-known organizations and browsers have started using the PhishTank database, such as
Yahoo! Mail, Opera, McAfee, and Mozilla Firefox [10]. In my prototype, I use PhishTank as the
provider of phishing URLs.
In the paper titled “Large-Scale Automatic Classification of Phishing Pages” [2], Colin
Whittaker, Brian Ryner, and Marria Nazif proposed an automatic classifier to detect phishing
websites. The classifier maintains Google's phishing blacklist automatically and analyzes
millions of pages a day, examining both the URL and the content to decide whether a page is
phishing or not. The paper proposes a classifier that works automatically within a large-scale
system, maintains a false positive rate below 0.1%, and reduces the lifetime of phishing pages.
The authors used machine learning techniques to analyze web page content. In my project, the
phishing determination is based on PhishTank's blacklist; what I aim to propose is a methodology
for classifying phishing websites. My ultimate goal is not to determine whether a page is phishing
or not, since PhishLurk relies on PhishTank's blacklist for that, but to provide a new method of
classifying phishing links that takes two factors into account: consuming as little memory and as
little screen space as possible, which should eventually improve overall classification efficiency.
In the paper titled “PhishGuard: A Browser Plug-in for Protection from Phishing” [8],
Y. Joshi, S. Saklikar, D. Das, and S. Saha proposed a mechanism that detects a forged website by
submitting fake credentials before the actual credentials during the website's login process. It
then analyzes the server's responses to all of those submissions to determine whether the website
is phishing or not. The mechanism was implemented on the browser side (the user side) as a Mozilla
Firefox plug-in. However, it only detects phishing during a user's login process: if another user
logs in to the same phishing website, that user goes through the same detection process again. In
my project, once a website is reported as a phishing site, the reported link is blocked and no
other user can access the reported website.
In the paper titled “BogusBiter: A Transparent Protection Against Phishing Attacks” [17],
Chuan Yue and Haining Wang proposed a client-side tool called BogusBiter that sends a large
number of bogus credentials to suspected phishing sites and hides the real credentials from
phishers. BogusBiter is unique in that it also helps legitimate websites detect stolen credentials
in a timely manner, by having the phisher verify the credentials he has collected at the
legitimate website. BogusBiter was implemented as a Firefox 2 extension. My project differs in
that it uses the server side to provide the protection.
Most popular browsers provide a phishing filter that warns users about malicious websites,
including phishing websites. These filters mainly depend on lists to detect malicious websites.
IE 7 used the “Phishing Filter”, which was improved into the SmartScreen Filter in later versions
of IE because of the weak protection the Phishing Filter provided [15]. In IE 8 and IE 9, the
SmartScreen Filter checks visited websites against a list of malicious websites that Microsoft
created and updates continuously [11] [12]. Similar to IE, the Safari browser has a filter that
checks websites against a list of phishing sites while the user is browsing. After PayPal warned
its members that Safari was not safe for its service [13], Safari started to use Extended
Validation certificates to support analyzing websites [14]. To browse safely, Safari users need to
use both. Earlier versions of Firefox took advantage of anti-phishing providers such as GeoTrust
or PhishTank, using their lists to help identify malicious websites. The current version of
Firefox has adopted Google's anti-phishing program to support its phishing protection.
Many research projects have proposed mechanisms implemented as browser plug-ins and toolbars
against phishing attacks. The main problem with plug-ins and toolbars is that they need the users'
cooperation. Users may not cooperate and install the tool, and some users occasionally prefer to
turn their filter off to browse faster [16]. On some devices, plug-ins and toolbars may not be as
effective as they are in desktop browsers because of limits on performance and screen space, as is
the case with smartphones. PhishLurk's mechanism aims to use as little screen space and memory as
possible on the client side, using the server side to provide the classification of and protection
from phishing links. So even if phishing protection is disabled on the client side, PhishLurk
still provides protected and classified links to the user.
Proposed Project
I propose a mechanism to protect users from phishing attacks. The mechanism assesses and
classifies sites on the server side, based on PhishTank's blacklist, and presents the result using
a color scheme. The system uses little screen space and memory so that it works even on small
devices. The mechanism classifies links into four types, each marked by a color. I expanded the
classification used in PhishTank as follows:

Phishing link (Red): a confirmed phishing link. The link is disabled, so even if the user is
unaware or surfing carelessly, as we saw in the survey [1], there is no way to access the link.

Unknown link (Orange): a suspicious link that might be a phishing link. It could be a link that
contains the name, or part of the name, of a real company and asks the user to provide sensitive
information. The link has been submitted as a phishing link but has not been verified yet. The
user can click and access this type of link at their own risk, and is warned before accessing it.

Unlikely link (Gray): the same as an unknown link, except that the blacklist has received a
report about a link that is unlikely to be a phishing link, for example a website whose Top-Level
Domain (TLD) is .edu or .gov. Such domains are unlikely to be used for attackers' websites because
they are reserved for the official use of organizations. The link remains unlikely until it is
verified by PhishTank. Note that someone may have reported the unlikely site in an attempt to
denigrate the organization, or the site may actually have been compromised through a cross-site
scripting or SQL injection attack, so it is fair to keep the unlikely status until the link is
verified and changed to a safe link.
Figure source: Global Phishing Survey: Trends and Domain Name Use, April 2011
According to the chart from this survey, 60% of phishing attacks were launched from the TLDs
.COM, .NET, .TK, and .CC.

Safe link (Green): a safe link that is not phishing. The user can access the link without
triggering any warning message.
Providing the protection from the server side and using the color scheme for classification
saves memory and screen space on the client side. The mechanism determines whether a website is
phishing or not based on the provided blacklist of phishing websites, which is updated
periodically to achieve the maximum possible accuracy.
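The four-way classification described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the proposal plans a PHP implementation) and is not the PhishLurk implementation. The blacklist snapshot, its URL-to-verified-flag layout, the example URLs, and the names `BLACKLIST`, `UNLIKELY_TLDS`, `COLORS`, and `classify` are all assumptions introduced here.

```python
from urllib.parse import urlparse

# Hypothetical blacklist snapshot: URL -> True if the report has been
# verified as phishing, False if it is still awaiting verification.
BLACKLIST = {
    "http://paypa1-login.example.com/": True,
    "http://bank-update.example.net/": False,
    "http://portal.example.edu/login": False,
}

# TLDs the proposal treats as unlikely to host phishing pages.
UNLIKELY_TLDS = (".edu", ".gov")

# Color assigned to each category in the proposal.
COLORS = {
    "phishing": "red",
    "unknown": "orange",
    "unlikely": "gray",
    "safe": "green",
}


def classify(url, blacklist=BLACKLIST):
    """Return 'phishing', 'unknown', 'unlikely', or 'safe' for a URL."""
    if url not in blacklist:
        return "safe"        # never reported
    if blacklist[url]:
        return "phishing"    # reported and verified
    host = urlparse(url).hostname or ""
    if host.endswith(UNLIKELY_TLDS):
        return "unlikely"    # reported, unverified, restricted TLD
    return "unknown"         # reported, unverified
```

For example, `COLORS[classify("http://portal.example.edu/login")]` evaluates to `"gray"` under this assumed snapshot, because the report is unverified and the host ends in .edu.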
The plan
In this project, I will develop an anti-phishing search website called “PhishLurk”, using PHP and
CSS, that responds to users' search queries with classified, protected links.
If a result is a phishing link, the engine classifies it as risky, disables it, and warns the
user by displaying it as a red link.
If the link is classified as “unknown” or “suspicious”, the engine lets users choose whether to
access it and warns them about the possible consequences.
If the link is classified as “unlikely”, the engine lets the user choose whether to access it at
their own responsibility, and warns that although the link is unlikely to be phishing, the site
might have been hacked or someone might be trying to denigrate the organization behind the website.
In the last case, when the link carries no risk or suspicion, the engine classifies it as a safe
link. I use CSS to help classify the links because it does not consume much screen space or
demand extensive computation.
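As a rough illustration of how a classified result could be turned into a colored link, the sketch below builds a small HTML fragment per result. It is written in Python for illustration only (the proposal plans PHP and CSS). The CSS class names, the warning texts, and the function `render_link` are hypothetical; the proposal fixes only the four categories and their colors.

```python
from html import escape

# Hypothetical CSS class names, one per category color.
CSS_CLASS = {
    "phishing": "pl-red",
    "unknown": "pl-orange",
    "unlikely": "pl-gray",
    "safe": "pl-green",
}

# Hypothetical warning texts for the two categories that warn the user.
WARNINGS = {
    "unknown": "Reported as phishing but not yet verified. Proceed at your own risk.",
    "unlikely": "Reported but unlikely to be phishing. The site may have been hacked or falsely reported.",
}


def render_link(url, title, category):
    """Return an HTML fragment for one search result."""
    css = CSS_CLASS[category]
    text = escape(title)
    if category == "phishing":
        # No href is emitted, so the result cannot be followed.
        return f'<span class="{css}">{text}</span>'
    href = escape(url, quote=True)
    if category in WARNINGS:
        note = escape(WARNINGS[category], quote=True)
        return f'<a class="{css}" href="{href}" title="{note}">{text}</a>'
    return f'<a class="{css}" href="{href}">{text}</a>'
```

In this sketch the warning is carried in the link's `title` attribute only to keep the example short; an interstitial warning page before the redirect would be another way to satisfy the "warn before access" requirement.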
Besides performing the classification and providing safe results to the user, the PhishLurk
system periodically reads and updates the blacklist from PhishTank.com to provide the most
up-to-date results.
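The periodic blacklist update can be sketched as a small cache that reloads its data once the local copy is older than a chosen interval. The sketch is in Python for illustration only and rests on several assumptions: the feed is taken to be a JSON array of objects with `url` and `verified` fields, the one-hour interval is arbitrary, and `parse_feed`, `Blacklist`, and the injected `fetch` callable are hypothetical names. How the feed is actually downloaded is deliberately left out.

```python
import json
import time

REFRESH_SECONDS = 3600  # assumed interval; the proposal does not specify one


def parse_feed(feed_text):
    """Map each reported URL to True (verified) or False (not yet verified)."""
    entries = json.loads(feed_text)
    return {e["url"]: e.get("verified") == "yes" for e in entries}


class Blacklist:
    """Keeps a local copy of the feed and reloads it when it becomes stale."""

    def __init__(self, fetch, now=time.time):
        self._fetch = fetch        # callable that returns the feed as text
        self._now = now            # clock, injectable for testing
        self._loaded_at = None
        self.entries = {}

    def get(self):
        """Return the entries, reloading them first if the copy is stale."""
        stale = (
            self._loaded_at is None
            or self._now() - self._loaded_at >= REFRESH_SECONDS
        )
        if stale:
            self.entries = parse_feed(self._fetch())
            self._loaded_at = self._now()
        return self.entries
```

Passing the download step in as `fetch` keeps the refresh logic testable without network access; a deployment would supply a function that retrieves the feed from PhishTank.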
PhishLurk’s Design
Metric for Evaluating the PhishLurk System
The proposed PhishLurk system can be evaluated by examining how effectively users can use it and
how much processing overhead it adds. We will conduct a survey on the usage of PhishLurk and
summarize the feedback. We will also perform stress tests on the system and collect statistics on
the average processing time overhead for classifying URLs and modifying links.
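One way the average processing-time overhead could be collected is sketched below, in Python for illustration only. The `classify` function here is a stand-in invented for the example, not the PhishLurk classifier, and `measure_overhead` and its `repeat` parameter are hypothetical names.

```python
import statistics
import time


def classify(url):
    """Stand-in classifier; a real run would call the PhishLurk classifier."""
    return "unknown" if "login" in url else "safe"


def measure_overhead(urls, repeat=100):
    """Return (mean, max) seconds spent classifying a single URL."""
    samples = []
    for _ in range(repeat):
        for url in urls:
            start = time.perf_counter()
            classify(url)
            samples.append(time.perf_counter() - start)
    return statistics.mean(samples), max(samples)
```

Calling `measure_overhead(list_of_urls)` yields the mean and worst-case per-URL time; comparing those figures with the time to return unclassified results would give the overhead attributable to classification.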
Deliverables

The working software prototype, PhishLurk, with a user guide and an installation manual.

A master's report documenting the design and implementation of PhishLurk, the
implementation choices and their performance evaluation, and the lessons learned.
References:
1. Rachna Dhamija, J. D. Tygar, and Marti Hearst. 2006. Why phishing works. In Proceedings of the SIGCHI conference on Human Factors in computing systems (CHI '06), Rebecca Grinter, Thomas Rodden, Paul Aoki, Ed Cutrell, Robin Jeffries, and Gary Olson (Eds.). ACM, New York, NY, USA, 581-590. DOI=10.1145/1124772.1124861 http://doi.acm.org/10.1145/1124772.1124861.
2. Aaron Blum, Brad Wardman, Thamar Solorio, and Gary Warner. 2010. Lexical feature based phishing URL detection using online learning. In Proceedings of the 3rd ACM workshop on Artificial intelligence and security (AISec '10). ACM, New York, NY, USA, 54-60. DOI=10.1145/1866423.1866434 http://doi.acm.org/10.1145/1866423.1866434
3. Gross, Ben. "Smartphone Anti-Phishing Protection Leaves Much to Be Desired | Messaging News." Messaging News | The Technology of Email and Instant Messaging. 26 Feb. 2010. Web. <http://www.messagingnews.com/story/smartphone-antiphishing-protection-leaves-much-be-desired>.
4. ComScore, Inc. "Smartphone Subscribers Now Comprise Majority of Mobile Browser and Application Users in U.S." ComScore, Inc. - Measuring the Digital World. ComScore, Inc, 1 Oct. 2010. <http://www.comscore.com/Press_Events/Press_Releases/2010/10/Smartphone_Subscribers_Now_Comprise_Majority_of_Mobile_Browser_and_Application_Users_in_U.S>.
5. Entner, Roger. "Smartphones to Overtake Feature Phones in U.S. by 2011." Http://www.nielsen.com. Nielsen Wire, 26 Mar. 2010. Web. <http://blog.nielsen.com/nielsenwire/consumer/smartphones-to-overtake-feature-phones-in-u-s-by-2011/>.
6. Kerstein, Paul L. "How Can We Stop Phishing and Pharming Scams?" CSO Online - Security and Risk. CSO Magazine Security and Risk, 19 July 2005. Web. <http://www.csoonline.com/article/220491/how-can-we-stop-phishing-and-pharmingscams->.
7. OpenDNS, LLC. PhishTank: an Anti-phishing Site. [Online]. http://www.phishtank.com.
8. Joshi, Y.; Saklikar, S.; Das, D.; Saha, S. "PhishGuard: A browser plug-in for protection from phishing," Internet Multimedia Services Architecture and Applications, 2008. IMSAA 2008. 2nd International Conference on, pp. 1-6, 10-12 Dec. 2008. doi: 10.1109/IMSAA.2008.4753929, URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4753929&isnumber=4753904
9. PhishTank - Statistics about phishing activity and PhishTank usage, http://www.phishtank.com/stats.php
10. PhishTank, Friends of PhishTank, http://www.phishtank.com/friends.php
11. "SmartScreen Filter: Frequently Asked Questions." Windows Home - Microsoft Windows. <http://windows.microsoft.com/enUS/windows7/SmartScreen-Filter-frequently-asked-questions-IE9>.
12. "SmartScreen Filter - Microsoft Windows." Windows Home - Microsoft Windows. Web. <http://windows.microsoft.com/enUS/internet-explorer/products/ie-9/features/smartscreen-filter>.
13. "Apple - Safari - Learn about the Features Available in Safari." Apple. <http://www.apple.com/ca/safari/features.html>.
14. TECH.BLORGE - Top Technology news, Paypal warns buyers to avoid Safari browser from Apple. <http://tech.blorge.com/Structure:%20/2008/02/28/paypal-warns-buyers-to-avoid-safari-browser-from-apple/>
15. "Firefox 2 Phishing Protection Effectiveness Testing." Home of the Mozilla Project. <http://www.mozilla.org/security/phishing-test.html>.
16. "AVIRA News - Anti-Virus Users Are Restless, Avira Survey Finds." Antivirus Software Solutions for Home and for Business. <http://www.avira.com/en/press-details/nid/482/>.
17. Chuan Yue and Haining Wang. 2010. BogusBiter: A transparent protection against phishing attacks. ACM Trans. Internet Technol. 10, 2, Article 6 (June 2010), 31 pages. DOI=10.1145/1754393.1754395 http://doi.acm.org/10.1145/1754393.1754395