MobiFish: A Lightweight Anti-Phishing Scheme for Mobile Phones
Longfei Wu, Xiaojiang Du, and Jie Wu
Department of Computer and Information Sciences, Temple University, Philadelphia, PA, 19122, USA
Presenter: Dr. Xiaojiang (James) Du
4/8/2015

Phishing Attacks
Phishing attacks aim to steal private information such as usernames, passwords, and credit card details by impersonating a legitimate entity.
Although security researchers have proposed many anti-phishing schemes, the threat of phishing attacks has not been well mitigated:
  Phishing sites expire and revive rapidly (Avg. 4.5 days).
  Attackers keep improving their techniques to circumvent existing anti-phishing tools.
  Mobile users are accustomed to being asked for credentials and providing them without checking the website.

Phishing Attacks
[Figure: Most targeted industry sectors]

Phishing Attacks Cont.
Almost all phishing attacks on PCs take the form of bogus websites.
Current PC browsers embed anti-phishing tools that can achieve a detection rate of over 90%.
However, during the adaptation to hardware-constrained mobile platforms, browsers abandoned or truncated many features and useful functions (like anti-phishing).
[Figure: The same phishing site opened with Chrome on PC and Chrome for Android]

Mobile Phishing Attacks
Mobile Web Phishing: Mobile phishing is an emerging threat targeting mobile users of financial institutions, online shopping sites, and social networking companies.
Mobile App Phishing: Some attackers develop fake applications (apps) or repackage legitimate apps, then upload these phishing apps to unofficial app markets.
It is harder to detect phishing apps than phishing on mobile web pages, because for web pages information can be retrieved from the HTML source code.
The trend of launching phishing attacks on mobile devices can be attributed to hardware limitations such as small screen size, and to the inconvenience of user input and application switching.

Existing Phishing Detection Schemes
Current web phishing detection schemes can be roughly divided into two categories: heuristics-based schemes and blacklist-based schemes.
Blacklist-based schemes can only detect phishing sites that are already in the blacklist; they cannot detect zero-day phishing attacks.
Heuristics-based schemes largely depend on features extracted from the URL and the HTML source code, and use other techniques such as machine learning to determine validity.
However, we find that features extracted from HTML source code can be inaccurate, and phishing sites can circumvent those heuristics.
There is no off-the-shelf tool to detect phishing apps on mobile platforms.

Our Solutions and Contributions
We propose MobiFish, a novel automated lightweight anti-phishing scheme for mobile phones. It is able to defend against both phishing webpages and phishing apps.
Identify the weakness of previous heuristics-based security schemes for webpage phishing, and develop a lightweight solution that uses optical character recognition (OCR) without relying on HTML source code, search engines, or machine learning techniques.
Implement MobiFish on a Google Nexus 4 smartphone running the Android 4.2 operating system.
Evaluate MobiFish with 100 phishing URLs and the corresponding legitimate URLs, as well as “Facebook” phishing apps.

Mobile Webpage Phishing Attacks
The mobile user interface increases vulnerability to mobile phishing attacks.
Due to the small display size of phone screens, most mobile browsers have to remove the status bar and hide the URL bar once the web page finishes loading.
Even during the loading process, long URLs are truncated to fit the browser frame.
Since the ability to read and verify URLs is crucial in detecting phishing attacks, a partial URL, or even a URL displayed with only part of its domain name, certainly increases the risk of being spoofed by phishing attacks.
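
The truncation risk described above can be illustrated with a small sketch. It assumes a URL bar that simply keeps the first N characters of a long URL; real mobile browsers use their own truncation policies, and both the URL and the 21-character width below are hypothetical values chosen for illustration.

```python
from urllib.parse import urlsplit


def truncate_for_display(url: str, max_chars: int) -> str:
    """Keep only the leading characters, as a narrow URL bar might (assumed policy)."""
    return url if len(url) <= max_chars else url[:max_chars]


def full_hostname(url: str) -> str:
    """Return the complete host name that the browser actually connects to."""
    return urlsplit(url).hostname or ""


# Hypothetical phishing URL: the well-known brand is only a subdomain prefix.
url = "http://www.paypal.com.account-verify.example.net/login"

shown = truncate_for_display(url, 21)   # what the user would see
actual = full_hostname(url)             # where the page really lives

print("displayed:", shown)
print("actual host:", actual)
```

Under these assumptions the visible prefix looks like a well-known brand's address, while the registered domain sits in the part of the host name that was cut off.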

Mobile Application Phishing Attacks
Application-oriented phishing attacks can be categorized into two types based on the way they launch:
  Some phishing apps attempt to hijack existing legitimate targets. They keep performing task polling and launch themselves as soon as they detect the launch of a target app. As a result, the fake login interface covers the real one, and the phishing app pretends to be the target app.
  Another type of phishing app directly appears as the target app. This may occur when a user downloads fake apps from unofficial app markets.
A mobile app phishing attack ends with the transmission of credentials to the attacker. Hence, blocking the transmission can effectively defend against the attack.

Overview of MobiFish Scheme
Phishing attackers apply fancy tactics to direct victims to their phishing sites or applications, which masquerade as trustworthy entities.
The key to solving the phishing problem is to find the discrepancy between the identity a page or app claims and its actual identity.
MobiFish consists of two independent components, WebFish and AppFish, designed for mobile webpages and mobile applications respectively.

Design of WebFish
We find that information extracted from HTML source code may not reflect the webpage displayed to users, since attackers can add text, images, and links to the HTML source code while making any “undesirable” content invisible, simply by changing its size or covering it with other images.
Hence, features like word frequency, brand name, and company logo can be easily manipulated.
The claimed identity should be extracted from the screen presented to the user.
The actual identity can be obtained from the web address (or network connection).

Identity Extraction
The claimed identity is extracted from a screenshot.
Most login interfaces of legitimate mobile sites and apps are very simple. The entire login page, or the majority of it, can be captured in one screenshot.
To obtain the claimed identity from a screenshot, OCR is used to convert the image into text. We use Tesseract, one of the most accurate open source OCR engines.
The actual identity is obtained from the web address.
Most enterprises use their brand name as the second-level domain name (SLD) of their official websites.
In cases where the brand name is not exactly the same as the SLD (e.g. brand name “AT&T” and SLD “att”), we build a whitelist that records common pairs of inconsistent brand name and SLD. For example, the brand name “AT&T” is directly mapped to the SLD “att”, and vice versa.

Identity Extraction Cont.
OCR Experiments
Our testing uses a ThinkPad T420 laptop (2.40GHz, 4GB RAM) with a pixel density of 131 dpi and a Google Nexus 4 smartphone (1.5GHz, 2GB RAM) with a pixel density of 320 dpi.
We open the eBay mobile login page in both the mobile and PC browsers and capture a screenshot in each.
Then, Tesseract is used to extract text from the phone screenshot, while Microsoft Office Document Imaging (MODI) is used for the PC screenshot.
[Figure: OCR output of Tesseract and MODI]
Tesseract takes only 1.6 seconds, while MODI takes 4.5 seconds.

Design of WebFish Cont.
Finally, WebFish compares the claimed identity with the actual identity.

Design of WebFish Cont.
The key idea of WebFish is that a URL is treated as phishing when its SLD is not among the text extracted from the screenshot of the login page.
As far as we know, no phishing site uses common login-page terms like “sign”, “username”, “password” or “welcome” as its SLD.
Well constructed and maintained legitimate web pages are unlikely to contain strange words.
If the actual domain name of a phishing site appears in the login page of the fake website, users can easily spot it and check the URL to verify the validity of the webpage.
If the attacker includes the phishing domain name on the screen in a tiny font size, OCR is not able to recognize it either, and WebFish will still mark the page as a phishing site.
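
The comparison described on the WebFish slides can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the OCR step is assumed to have already produced a text string, the SLD extraction assumes a single-label top-level domain such as ".com", the word-level matching is an assumption about how "among the text" is checked, and the function names and whitelist contents are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical whitelist of brand names whose SLD differs from the brand name.
BRAND_TO_SLD = {"at&t": "att"}


def second_level_domain(url: str) -> str:
    """Actual identity: the label just left of the TLD (assumes a one-label TLD)."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return labels[-2] if len(labels) >= 2 else host


def claimed_tokens(ocr_text: str) -> set:
    """Claimed identity: lower-cased words found in the OCR output of the screenshot."""
    tokens = set(re.findall(r"[a-z0-9&]+", ocr_text.lower()))
    # Apply the whitelist in both directions (brand -> SLD and SLD -> brand).
    for brand, sld in BRAND_TO_SLD.items():
        if brand in tokens:
            tokens.add(sld)
        if sld in tokens:
            tokens.add(brand)
    return tokens


def looks_like_phishing(url: str, ocr_text: str) -> bool:
    """Flag the page when its SLD does not appear in the on-screen text."""
    return second_level_domain(url) not in claimed_tokens(ocr_text)


# Usage with made-up OCR output and URLs.
print(looks_like_phishing("https://signin.ebay.com/", "Sign in to eBay"))
print(looks_like_phishing("http://secure-update.example.net/ebay", "Sign in to eBay"))
```

In this sketch a tiny, unreadable domain name on the page simply never enters the OCR text, so the page is still flagged, which matches the behaviour described above.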

Design of AppFish
AppFish maintains a database called the suspicious app set (SAS), which contains profiles of untrusted apps, including user ID (Uid), launching time, and screenshot text. These apps should:
  Be specific to one company. This ensures that the app only connects to the company’s official sites or affiliated (partner) servers. The domain names of collaborators are pre-checked and added to the SAS profile in advance (e.g. Facebook and its content delivery networks).
  Have a user login. Many apps do not need users to log in, and app phishing attacks would not happen in them at all (e.g. apps for news, games, music, or maps).

Design of AppFish Cont.
Phishing apps are not able to load valid following pages, so users will suspect their validity within a short time.
Hence, a phishing app can only send out user credentials during a short period (denoted as T) after the user clicks on the phishing page.
AppFish monitors the possible paths that allow a phishing app to transmit data to the outside, which include socket, HttpGet/HttpPost, SMS, email (email is based on socket), etc.
AppFish rules:
  The SLD of the HTTP connection destination has to be in the screenshot text or in the affiliated domain names stored in the SAS profile.
  Socket and SMS functions can be blocked for a period of time, which should be long enough for the user to notice (and uninstall) the phishing app.

Design of AppFish Cont.
The AppFish defense scheme works in two phases: a launching phase and an authentication phase.

Performance Evaluation
We implement MobiFish on a Nexus 4 smartphone. We modify the source code of the Android 4.2 system so that it is able to support MobiFish.
Experiments with WebFish
We randomly pick 100 phishing URLs from PhishTank.com. Most of them are highly similar to their legitimate counterparts.
The input forms in phishing login pages are often surrounded by brand names or company logos, as in the legitimate login pages.
When loading a large conventional web page, mobile browsers often display the area that contains the input form instead of an overview of the entire web page.

Performance Evaluation Cont.
WebFish is able to detect all the phishing webpages and achieves a 100% verification rate for legitimate URLs.

Performance Evaluation Cont.
Experiments with AppFish
There are only a few reported phishing apps, and none of them is available online.
To test the effectiveness of AppFish, we develop two sample phishing apps: one hijacks the real Facebook app and the other appears as “Facebook”.
After the user clicks the “Log in” button, the fake apps send the credentials to our server by HttpGet, HttpPost, socket, SMS, and email, respectively.
AppFish can block all the connections and warn users about the phishing attempts.

Conclusion
We proposed MobiFish, a novel lightweight mobile phishing defense scheme.
MobiFish uses OCR, which can accurately extract text from the screenshot of a mobile login interface so that the claimed identity is obtained. Mobile phones have a higher dpi than PCs.
Compared to existing OCR-based anti-phishing schemes (designed for PC only), MobiFish is lightweight and works without using external search engines or machine learning algorithms.
We implemented MobiFish on a Google Nexus 4 smartphone and conducted experiments, which show that WebFish and AppFish can effectively detect and defend against mobile phishing attacks.

Thank You!
Prof. Xiaojiang (James) Du
Dept. of Computer and Information Sciences, Temple University, Philadelphia, PA, 19122, USA
Email: dux@temple.edu
Web: www.cis.temple.edu/~xjdu