Behavioral Detection of Malware on Mobile Handsets
Abhijit Bose, Xin Hu, Kang G. Shin, Taejoon Park
Presented by: Suparna Manjunath
Dept of Computer & Information Sciences, University of Delaware
CISC 879 - Machine Learning for Solving Systems Problems

Malware on Mobile Handsets
- Like PCs, mobile handsets are becoming more intelligent and complex in functionality
- Exposure to malicious programs and risks increases with the new capabilities of handsets
- Cabir, the first mobile worm, appeared in June 2004
- WinCE.Duts, a Windows CE virus, was the first file injector on mobile handsets capable of infecting all the executables in the device's root directory

Limitations of current anti-virus solutions for mobile devices
- Rely primarily on signature-based detection
- Useful mostly for post-infection cleanup
- Example: scan the system directory for files with the extensions .APP, .RSC and .MLD on Symbian-based devices
- Due to differences between mobile and traditional desktop environments

Why are conventional anti-virus solutions less efficient for mobile devices?
- Mobile devices generally have limited resources such as CPU, memory, and battery power
- Most published studies on the detection of Internet malware focus on network signatures
- Mobile OSes differ in important ways in how file permissions and modifications to the OS are handled

Goal
Develop a detection framework that
- Overcomes the limitations of signature-based detection
- Addresses the unique features and constraints of mobile handsets

Approach
- A behavioral detection approach is used to detect malware on mobile handsets

Behavioral Detection
- The run-time behavior of an application is monitored and compared against malicious and/or normal behavior profiles
- More resilient to polymorphic worms and code obfuscation
- The database of behavior profiles is much smaller than that needed for storing signature-based profiles
- Suitable for resource-limited handsets
- Has potential for detecting new malware

System Overview

Malicious Behavior Signatures
- Behavior Signature: manifestation of a specification of resource accesses and events generated by applications
- Monitoring a single event of a process in isolation is not sufficient to classify an activity as malicious
- Temporal Pattern: the precedence order of the events and resource accesses is the key to detecting malicious intent

Temporal Patterns - Example
- Consider a simple file transfer made by calling the Bluetooth OBEX system call in Symbian OS
- On its own, any such call appears harmless
- Temporal Pattern: (received file is of type .SIS) and (that file is executed later) and (installer process seeks to overwrite files in the system directory)

Representation of Malicious Behavior
- Simple Behavior: specified by ordering the corresponding actions using a vector clock and applying the "and" operator to the actions
- Complex Behavior: specified using temporal logic instead of classical propositional logic
- The specification language of TLCK (Temporal Logic of Causal Knowledge) is used to represent malicious behaviors within the context of a handset environment

Behavior Signature
- A finite set of propositional variables interposed using TLCK
- Each variable (when true) confirms the execution of either
  - A single system call or an aggregation of system calls
  - An event such as read/write access to a given file descriptor, directory structure or memory location
- $PS = \{p_1, p_2, \ldots, p_m\} \cup \{i \mid i \in N\}$

Operators used to define Malicious Behavior
- Logical Operators:
- Temporal Operators:

Example: Commwarrior Worm – Behavior Signature

Atomic Propositional Variables

Higher Level Signatures
- Harmless Signatures:
- Harmful Signatures:

Generalized Behavior Signatures
- Studied more than 25 distinct families of mobile viruses and worms targeting the Symbian OS
- Extracted the most common signature elements and created a database from them
- Malware actions were placed into 3 categories:
  - User Data Integrity
  - System Data Integrity
  - Trojan-like Actions

Run-Time Construction of Behavior Signatures
- Proxy DLL to capture API call arguments

Major Components of Monitoring System

Behavior Classification By Machine Learning Algorithm
- Behavior signatures for the complete life cycle of malware are placed in the behavior database for run-time classification
- To activate early response mechanisms, the malicious behavior database must also contain partial signatures that have a high probability of eventually manifesting as malicious behavior
- The behavior detection system can detect even new malware or variants of existing malware whose behavior only partially matches the signatures in the database
- An SVM is trained on data from both normal and malicious applications to classify partial behavior signatures

Possible Evasions
Program behavior can be obfuscated by:
- Behavior reordering
- File or directory renaming
- Normal behavior insertion
- Equivalent behavior replacement

Limitations
- Detection might fail if most behaviors of a mobile malware are completely new or the same as those of normal programs
- The system can be circumvented by malware that can bypass the API monitoring or modify the framework configuration

Evaluation
- The monitor agent (platform dependent) and the behavior detection agent (platform independent) are evaluated
- Program behavior is first emulated, and the framework is then tested against real-world worms
- 5 malware applications (Cabir, Mabir, Lasco, Commwarrior, and a generic worm that spreads by sending messages via MMS and Bluetooth) and 3 legitimate applications (Bluetooth OBEX file transfer, MMS client, and the MakeSIS utility in Symbian OS) were built
- Dataset pipeline: Applications (Malware + Legitimate) -> Set of Behavior Signatures -> Obtain Partial/Full Signatures -> Remove Redundant Signatures -> Training Dataset and Testing Dataset

Classification Accuracy of Known Worms

Detection Accuracy (%) of Unknown Worms

Evaluation with Real-world Mobile Worms
- Two Symbian worms, Cabir and Lasco, are considered
- Behavior signatures are collected by compiling the worms and running them on a Symbian emulator
  - SVC achieved 100% detection of all worm instances
- The framework's resilience to variations and obfuscation is tested using variants of Cabir
  - The variants are easily detectable because behavioral detection abstracts away the name details

Conclusions
- Because fewer signatures are needed, the malware database is compact and can be placed on a handset
- Can potentially detect new malware and their variants
- Behavioral detection results in high detection rates

Thank You
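
The temporal pattern from the "Temporal Patterns - Example" slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Event record, the action names ("receive_file", "execute", "overwrite"), the integer timestamps (standing in for the vector clock mentioned in the slides), and the "c:/system/" path are all hypothetical, and the sketch does not track which process performs the overwrite. It only shows that the three events must occur in the stated order for the pattern to match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One monitored action. All field values here are hypothetical."""
    time: int     # simple logical timestamp (stand-in for a vector clock)
    action: str   # "receive_file", "execute", or "overwrite"
    target: str   # file the action touches


def matches_sis_pattern(events, system_dir="c:/system/"):
    """Return True if the trace contains, in this order:
    (1) a .SIS file is received,
    (2) that same file is executed later,
    (3) a file in the system directory is overwritten after that.
    """
    received = {}        # .SIS file name -> time it was received
    executed_at = None   # time a received .SIS file was first executed
    for ev in sorted(events, key=lambda e: e.time):
        target = ev.target.lower()
        if ev.action == "receive_file" and target.endswith(".sis"):
            received.setdefault(target, ev.time)
        elif ev.action == "execute":
            if target in received and ev.time > received[target]:
                if executed_at is None:
                    executed_at = ev.time
        elif ev.action == "overwrite":
            if (executed_at is not None and ev.time > executed_at
                    and target.startswith(system_dir)):
                return True
    return False


# Hypothetical traces for illustration only.
malicious_trace = [
    Event(1, "receive_file", "game.sis"),
    Event(2, "execute", "game.sis"),
    Event(3, "overwrite", "c:/system/apps/menu.app"),
]
harmless_trace = malicious_trace[:2]   # file received and run, nothing overwritten
reordered_trace = [
    Event(1, "overwrite", "c:/system/apps/menu.app"),
    Event(2, "receive_file", "game.sis"),
    Event(3, "execute", "game.sis"),
]

print(matches_sis_pattern(malicious_trace))
print(matches_sis_pattern(harmless_trace))
print(matches_sis_pattern(reordered_trace))
```

Each event is harmless on its own, so the check returns True only for the trace in which all three events appear in the required precedence order. The same three events in a different order do not match.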