Panorama: Capturing System-wide Information Flow for Malware Detection and Analysis
  Authors: Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel, and Engin Kirda
  Publication: ACM Conference on Computer and Communications Security, 2007
  Presenter: Brad Mundt for CAP6133 Spring '08

Motivation
  - Malicious software sneaks onto computers
    - Collects users’ private information
    - Causes havoc on the Internet
    - Slows performance
    - Costly to remove
  - Reputable vendors violate users’ privacy
    - Google Desktop
    - Sony Media Player

Traditional malware detection
  - Signature-based
    - Cannot detect new malware or variants
  - Heuristics
    - High false positives
    - High false negatives

The Panorama way
  - Input
    - Suspicious behavior
  - Process
    - Whole-system, fine-grained taint tracking
      - Marking data
    - Operating-system-aware taint analysis
      - Inappropriate data access, done stealthily
      - What touches the tainted data and how
  - Output
    - Taint Graphs
      - Tracked tainted data

Taint Graph
  - Information flow that shows which processes accessed the tainted data
  - Make policies based on the Taint Graph
  - Compare unknown samples against the Taint Graph
    - Automatic
    - Numerous categories

Taint Graph example

Taint Graph generation
  - Similar to a mapped-out logic/process tree
    - Conceptually, horizontal branching
  - 9 different types of root taint sources
    - Text, password, HTTP, HTTPS, ICMP, FTP, document, and directory
  - Non-root entries can be
    - OS objects (processes, modules)
    - Resources (such as a file)

System Overview
  - Conceptual structure
  - Works with closed code
    - Windows OS
    - Firefox
  - Monitors the whole system in a processor emulator
  - Shadow memory stores the taint status of
    - Each byte of physical memory
    - The CPU’s general-purpose registers
    - The hard disk and network interface buffer

Taint Sources
  - Test information is input and marked as a taint source
  - Input comes from hardware such as
    - Keyboard
    - Network interface
    - Hard disk
  - Tainting at the hardware level
    - Malware could hook the input before it reaches the software

Taint propagation
  - Monitors CPU instructions and DMA operations that deal with tainted data

OS-Aware taint tracking
  - Developed a kernel module
  - Authenticated communications to the taint engine

Code identification
  - Identifying the code under analysis and its actions
    - The entire code segment is labeled
    - Dynamic or encrypted code is labeled too
  - A similar method labels trusted code

Three categorized behaviors
  - Anomalous information access
    - MS Paint accessing passwords
  - Anomalous information leakage
    - Browser Helper Object (BHO) reporting home about surfed websites
  - Excessive information access
    - Repeatedly accessing a directory to hide a rootkit

Malware detection
  - 42 real-world malware samples and 56 benign applications were tested
  - Only 3 false positives, no false negatives
    - 2 from a personal firewall
    - 1 from a browser accelerator

Summary
  - A new system to detect malware
    - System-wide taint tracking
    - Taint information flow
    - Data access and process tracking graphs
    - Policies

Contributions
  - Unified approach to detect and analyze diverse malware
  - Designed and developed a functional prototype
  - Detected all malware samples
    - Keystroke loggers, password sniffers, packet sniffers, stealth backdoors, rootkits, and spyware

Weaknesses
  - Performance overhead
    - Measured using Cygwin utilities
    - Prototype is not optimized
    - Average slowdown is 20 times
    - Intended as an offline tool
  - Evasive malware
    - Time bombs
    - Selective keystroke loggers
    - Virtual environment detection

How to Improve
  - Optimize the code
  - Automate taint graph analysis and policy implementation
  - Virtual environment shielding
    - Or switch out of the emulated environment
  - Implement mentioned improvements
    - Unicode conversion: switch-case issue

The End
  Thank you…
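
Backup: taint propagation sketch

The shadow memory, taint propagation, and Taint Graph ideas from the earlier slides can be illustrated with a small Python sketch. This is not the authors' implementation: Panorama tracks taint per instruction inside a processor emulator, while the code below only models byte copies. The names `ShadowMemory`, `TaintGraph`, and `anomalous_access`, the addresses, and the process names are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


class ShadowMemory:
    """Hypothetical shadow memory: one set of taint labels per byte address."""

    def __init__(self):
        self._labels = defaultdict(set)

    def taint(self, addr, length, label):
        # Mark input bytes at the (simulated) hardware level.
        for a in range(addr, addr + length):
            self._labels[a].add(label)

    def labels(self, addr, length=1):
        # Union of the labels carried by a byte range.
        found = set()
        for a in range(addr, addr + length):
            found |= self._labels.get(a, set())
        return found

    def copy(self, dst, src, length):
        # Data movement: each destination byte inherits the source byte's taint.
        snapshot = [set(self._labels.get(src + i, ())) for i in range(length)]
        for i, src_labels in enumerate(snapshot):
            if src_labels:
                self._labels[dst + i] = src_labels
            else:
                self._labels.pop(dst + i, None)  # overwritten with clean data


class TaintGraph:
    """Hypothetical taint graph: taint source -> processes that touched it."""

    def __init__(self):
        self.edges = defaultdict(set)

    def record_access(self, process, labels):
        for label in labels:
            self.edges[label].add(process)


def anomalous_access(graph, sample, sensitive_labels):
    # Policy sketch: the sample under analysis touched sensitive tainted data.
    return sorted(l for l in sensitive_labels if sample in graph.edges.get(l, ()))


shadow = ShadowMemory()
graph = TaintGraph()

shadow.taint(0x1000, 8, "password")      # keystrokes arrive in an input buffer
shadow.copy(0x2000, 0x1000, 8)           # delivered to the intended application
graph.record_access("browser.exe", shadow.labels(0x2000, 8))
shadow.copy(0x3000, 0x2000, 8)           # the sample under analysis copies it too
graph.record_access("sample.dll", shadow.labels(0x3000, 8))

print(anomalous_access(graph, "sample.dll", {"password"}))  # ['password']
```

Keeping a set of labels per byte, rather than a single tainted bit, is what lets the graph say which root source (password, HTTP, document, and so on) a process touched.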