Remote Virtual Machine Monitor Detection
Jason Franklin, Mark Luk, Jonathan McCune, Arvind Seshadri, Adrian Perrig, Leendert van Doorn

[Figure: an external verifier asks a remote machine, "Are you virtual?"]

Problem Statement
• Determine if a remote machine is virtual or real

Challenges
• VMM provides an accurate abstraction of the underlying hardware
• VMM controls execution of code and may return arbitrary values

VMM Detection and Botnets (1/2)
Scenario 1
• Bots may install a stealthy virtual-machine-based rootkit (VMBR) to avoid detection by traditional malware scanners
• Stealthy rootkits prevent administered machines from removing bots
• You run an AV, update, patch, yet never locate/remove the bot
• Detecting VMMs allows us to detect bots

VMM Detection and Botnets (2/2)
Scenario 2
• Bots may check for the existence of a VMM in order to prevent dynamic analysis
• “Detecting the sandbox”
• Real threat & mentioned several times yesterday
• Agobot uses a heuristic to check for VMware
• Studying VMM detection helps us understand how to enable VMM-based dynamic analysis

State of the Art in VMM Detection
Check for software-implementation artifacts
• Redpill checks the location of the IDT (different location under VMware)
• VMware’s Back checks for the VMware I/O port
Other approaches
• Make restrictive assumptions
• Easy to thwart
• Require benchmarking

Our Goals
Develop a VMM detection algorithm:
• VMM implementation independent
• Accurate
• Practical/relies on few assumptions
Leverage fundamental differences between virtual and real machines

VMM Model
Popek and Goldberg ’74 formally defined the properties a control program must satisfy to be deemed a VMM
• Efficiency Property
• Resource Control Property
• Equivalence Property
  • Program execution in a virtual environment must be indistinguishable from execution in a real environment

Indistinguishable? Oh no!
• If a program executes indistinguishably, we can’t detect a virtual execution environment

Don’t worry!
There are exceptions to the equivalence property
• Timing dependency exception
  • Certain sequences of instructions may take longer to execute
• Resource availability exception

Does the timing dependency exception necessarily exist?
Empirically, yes.
• Programs executing in a VMM experience VMM overhead
In theory, yes.
• Intuition is that the VMM must maintain control of executing code by interposing on its operations or rewriting the binary

Exploiting the timing dependency exception to detect a VMM
Algorithm:
Given:
• Real machine R with configuration C, e.g., C = {Pentium IV, 2.0 GHz}
• Remote machine M with configuration C
• Program P with control-modifying instructions
1: Time the execution of P on R and store the value in r
2: Time the execution of P on M and store the value in m
3: IF m > r + k THEN M is virtual [note: k is the detection constant]
4: ELSE M is real

Tasks Remaining
• Achieve accurate high-integrity execution timing
• Construct program P with externally noticeable VMM overhead
• Determine configuration of remote machine
• Determine detection constant k

Accurate High-Integrity Execution Timing
• Can’t trust the integrity of the timing measurements returned by the VMM
• Use an external source of time (e.g., remote machine, watch)

Constructing P with VMM Overhead
• P is a sequence of sensitive (potentially control-modifying) instructions that requires VMM interposition
• P is designed to invoke VMM overhead
• Design decisions in developing P include:
  • Sensitive instruction selection
  • Number of instructions

Selecting Sensitive Instructions
• R/W cr3
• R/W cr2
• R/W cr0
• cli

Number of Instructions in P
• Assume we have complete configuration information for remote machine M
• Easy to determine the number of instructions required to overcome experimental noise
  • Variance in execution time
  • Variance in network latency

Complete Configuration Information
• Fastest VMM = FV(x)
• Real Machine = RM(x)
• Given an estimate of the noise N in the environment (e.g., 10 ms variation in network latency), select x s.t. FV(x) - RM(x) >> N

Incomplete Configuration Information
• Unreasonable to assume complete configuration information is available for a remote machine
• Use “hardware discovery” heuristic
  • Intuition: certain properties of the underlying hardware are difficult to mask through the VMM and are unique to a particular architecture
  • Discovering these hardware artifacts gives us partial configuration information about a remote machine

Incomplete Configuration Information
• Given a subset C’ of the complete configuration information C
  • C = {Pentium IV, 2.0 GHz} and C’ = {Pentium IV}
• Bound the execution time of P on the fastest and slowest machines that satisfy C’
  • Works because P is CPU bound
  • We can time the execution of P on an x GHz machine and then use the ratio of the fastest and slowest machines to bound the execution times

Hardware Discovery on the Pentium IV
• P4 has a unique trace cache which “shines” through the VMM
• When sequences of register-to-register arithmetic instructions without data hazards populate the trace cache of the Intel Pentium IV, a CPI of 1/3 is attainable
• Once an instruction sequence exceeds the trace cache’s size of 12KB, the CPI becomes 1

Remote Trace Cache Discovery
• 11264 instructions fit in the trace cache
• 11328 instructions exceed the size of the trace cache
• A considerable jump in overhead occurs when the trace cache overflows

Putting it All Together
• Remotely timed overhead from reading and writing x86 Control Register 3 multiple times consecutively
• Despite not being included in our analysis, remote detection works against a machine running Xen with hardware virtualization support (HVM Xen)
  • We conclude that hardware virtualization support is not sufficient to prevent VMM detection

Detection Algorithm Limitations
• VMM could tamper with execution of detection code
  • Countermeasure: Leverage software-based attestation (Pioneer)
• VMM could prevent communication to external timer
  • Countermeasure: Containment policy-based detection
• Receive incorrect response from hardware discovery heuristic
• VMM may be incorporated with OS
  • Malware can still own the lowest layer
  • Virtual-machine-based rootkits are a threat today

Conclusion
Developed a remote VMM detection algorithm
• Attempts to be independent of VMM software implementation details
• Practical/relies on fewer assumptions than previous schemes
• Accurate, configurable, and effective over the Internet
Hardware virtualization support is not sufficient to mask differences between real and virtual environments
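
Sketch: the timing comparison in code
The decision rule from the algorithm slide (M is virtual if m > r + k) and the clock-ratio bound from the incomplete-configuration slide can be written down in a few lines. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, run time is assumed to scale inversely with clock frequency (the slides only say P is CPU bound), and taking the median of several remote timings is an added assumption to damp network jitter.

```python
from statistics import median


def detect_vmm(r, m, k):
    """Steps 3-4 of the algorithm: M is judged virtual if m > r + k."""
    return m > r + k


def scale_bounds(t_ref, ref_ghz, slowest_ghz, fastest_ghz):
    """Bound the run time of a CPU-bound P over the clock speeds allowed by C'.

    Assumption: run time scales inversely with clock frequency.
    Returns (shortest, longest) expected time on a real machine.
    """
    shortest = t_ref * ref_ghz / fastest_ghz
    longest = t_ref * ref_ghz / slowest_ghz
    return shortest, longest


def detect_vmm_partial(remote_samples, t_ref, ref_ghz,
                       slowest_ghz, fastest_ghz, k):
    """Detection with only partial configuration C'.

    Compares the remote timing against the slowest real machine that
    satisfies C', so a slow but real machine is not flagged as virtual.
    """
    _, longest = scale_bounds(t_ref, ref_ghz, slowest_ghz, fastest_ghz)
    m = median(remote_samples)  # assumption: median damps network jitter
    return detect_vmm(longest, m, k)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the talk.
    print(detect_vmm(r=1.0, m=1.5, k=0.2))
    print(detect_vmm_partial([2.5, 2.6, 9.0], t_ref=1.0, ref_ghz=2.0,
                             slowest_ghz=1.0, fastest_ghz=4.0, k=0.3))
```

In practice the timings m would come from an external timer, as the slides require, and k would be chosen from the noise estimate N so that FV(x) - RM(x) comfortably exceeds it.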