CMV: Automatic Verification of Complete Mediation for Java Virtual Machines
A. Prasad Sistla, V.N. Venkatakrishnan, M. Zhou, H. Branske
University of Illinois at Chicago

Introduction: Java Language and Java Platform
• Java Language: powerful features; has gained considerable success.
• Java Platform: enables users to write applications in the Java language and run them directly on top of it.
[Figure: Java Platform layers: Java applications; Java compiler and other tools; libraries (including java.io, java.net, java.lang and other packages); Java Virtual Machine (such as HotSpot, Harmony); OS (e.g. Solaris, Linux, Windows)]

Java Applications and Platforms are Used Everywhere
• Almost every browser supports running Java Applets.
• Java Applet: a Java program that is run from inside a web browser.

An Example of Java Applet
Internet Explorer browses a webpage containing a Java Applet. Webpage content:

    <APPLET NAME="QuizMaster" CODE="QuizMaster.class" ….> …. </APPLET>

- From http://www.realapplets.com/applets/quizmaster/

Risks
What can happen if a Java applet or Java program is allowed to access any system resource?
• Compromise the security of the local host
• Disclose confidential information by transmitting it
• Destroy system files
• Display annoying pictures on the user's screen
How does the JVM prevent this from happening?

Java Fine-grained Access Control Model
• Security policies: map code to sets of permissions to access security-sensitive resources.

    grant codeBase "URL" {
        permission java.io.FilePermission "/home/tmp/", "read";
    };

• Runtime monitoring in the JVM: through a call to the library class SecurityManager before accessing any sensitive resource.

A Typical Use of SecurityManager

    public FileInputStream(File file) throws FileNotFoundException {
        String name = (file != null ? file.getPath() : null);
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkRead(name);   // permission check
        }
        ……
        open(name);                     // sensitive operation, implemented as a native method
    }

• Potential problem: without checking the permissions, any file specified by the application can be opened, leading to potential leaks of confidential information.
• Existing solution: SecurityManager (SM) checks.
• Challenge: ensure that the SecurityManager is consulted on all paths that lead to sensitive operations.

Our Goal
• Ensure the Java Standard Libraries are trustworthy.
• They must satisfy the Complete Mediation Property: the SecurityManager is consulted on all paths that lead to sensitive operations.
• This assurance will benefit millions of Java users.

Sensitive Operations
• Implemented as native methods.
• Native methods are code specific to a hardware and operating system platform, written in other languages such as C and C++, and called by Java code.

Challenges to Our Goal
• Large code base: thousands of classes in the Java libraries.
• Existing model checking tools (e.g. SLAM, MOPS) are general purpose, while our goal is specialized to verifying the complete mediation property. These generic tools may not scale to a code base as large as the Java libraries.

Our Solution
• Input: Java Standard Libraries and the security property.
• Verifier: an automated static analysis technique. Sound, but not complete.
• Output (Yes/No): either all paths from public methods to sensitive operations are guarded by security checks, or some paths from some public methods to sensitive operations are missing security checks.

Our Contributions
• A scalable automated model checking technique that verifies satisfaction of complete mediation in open systems.
• A tool, Complete Mediation Verifier (CMV), that efficiently analyzes JVM library classes.
• Experimental results on two shrink-wrapped JVM implementations: the HotSpot and Harmony VMs.
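To make the complete mediation property concrete, the following is a minimal sketch, not code from the JVM libraries. The `Guard` interface, `nativeOpen`, `openChecked`, and `openUnchecked` are hypothetical stand-ins (for SecurityManager, a native sensitive operation, and two library methods) so that the example runs without a real security manager. One method consults the guard on its only path to the sensitive operation; the other has a branch that skips the check, which is the kind of path the verifier is meant to report.

```java
import java.util.ArrayList;
import java.util.List;

public class MediationDemo {
    /** Hypothetical stand-in for SecurityManager; not a JDK type. */
    public interface Guard {
        void checkRead(String path); // throws SecurityException when access is denied
    }

    /** Records every "sensitive operation" that actually ran. */
    public static final List<String> opened = new ArrayList<>();

    /** Hypothetical stand-in for a native sensitive operation such as open(name). */
    private static void nativeOpen(String path) {
        opened.add(path);
    }

    /** Mediated: the guard is consulted on the only path to nativeOpen. */
    public static void openChecked(Guard guard, String path) {
        if (guard != null) {
            guard.checkRead(path);
        }
        nativeOpen(path);
    }

    /** Not mediated: when fastPath is true, the check is skipped. */
    public static void openUnchecked(Guard guard, String path, boolean fastPath) {
        if (!fastPath && guard != null) {
            guard.checkRead(path);
        }
        nativeOpen(path);
    }

    public static void main(String[] args) {
        Guard denyAll = path -> { throw new SecurityException("read denied: " + path); };
        try {
            openChecked(denyAll, "/home/tmp/secret");
        } catch (SecurityException e) {
            System.out.println("checked path blocked: " + e.getMessage());
        }
        openUnchecked(denyAll, "/home/tmp/secret", true); // slips past the guard
        System.out.println("operations that ran: " + opened);
    }
}
```

A test suite would have to exercise the `fastPath` branch to notice the problem; a static check over all paths does not depend on which inputs happen to be tried.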
A Code Example to Walk through Various Verification Approaches

    public void X() {
        // do some operations
        Y();
        // sensitive-operation
        FileRd1();                  // risky
    }

    private void Y() {
        if (….) {
            SecurityManager sm = System.getSecurityManager();
            if (sm != null) {
                sm.checkPermission(new FilePermission(pathname, "read"));
            }
            // sensitive-operation
            FileRd2();
        } else {
            // other operations     (a branch without a security check)
        }
    }

Analysis: Naive First Approach

    private void Y() {
    0.  if (….) {
    1.      SecurityManager sm = System.getSecurityManager();
    2.      if (sm != null) {
    3.          sm.checkPermission(new FilePermission(pathname, "read"));
            }
            // sensitive-operation
    4.      FileRd2();
        } else {
    5.      // other operations
        }
    }

[Figure: CFG(Y) with nodes Y0, Y1, Y2, Y3, Y4, Y5 and RET; the path through the security check Y3 is marked safe]

First Approach (Contd.) - Presence of Method Calls
Expand and analyze the CFG inline to make Extended CFGs (ECFG).

    public void X(int x) {
    0:  // do some operations
    1:  Y();
        // sensitive-operation
    2:  FileRd1();
    }

[Figure: CFG(X) with nodes X0, X1 (invokes Y), X2 and RET expands to an ECFG in which X1 is replaced by the nodes of CFG(Y); X2 is marked at risk]

Inline expansion drawbacks:
• Recursion
• The size of ECFG(M) can be exponential

Summarization
High-level idea: analyze each method once; "summarize" and reuse the results.
• All_Path_Secure: no paths from the entry node to the return node(s) without a security check. Otherwise, Insecure_Path.
• Good: no paths from the entry node to unguarded sensitive operations. Otherwise, Bad.
We store the negations of these two properties.

Method Summary
A 2-tuple <Insecure_Path, Bad> with 4 kinds of values:
• <insecure_path, bad>
• <insecure_path, ⊥>
• <⊥, bad>
• <⊥, ⊥>

Observations to Compute Method Summary
• To compute the Insecure_Path attribute: a node that invokes an Insecure_Path method M1 is equivalent to a neutral node.
• To compute the Bad attribute: a node that invokes a Bad method M1 is equivalent to a sensitive node.
• A node that invokes an All_Path_Secure and Good method M1 is equivalent to a security check node.

Second Approach - Computing Summaries of Methods
Given two methods X and Y, where X invokes Y.
Step 1: Construct the call graph (X invokes Y).
Step 2: Determine that Y needs to be analyzed first; the summary of the method computed first can then be reused. In the presence of multiple methods, list the methods in reverse topological order.
Step 3: Analyze CFG(Y) to compute Summary(Y) = <Insecure_Path, ⊥>.
[Figure: CFG(Y) with nodes Y0, Y1, Y2, Y3, Y4, Y5 and RET; the checked path is marked safe]

Second Approach (Contd.)
Step 4: Compute Summary(X).
• X1 invokes Y, and Summary(Y) = <Insecure_Path, ⊥>, so invocation node X1 is equivalent to a neutral node.
• X2 is at risk.
• Summary(X) = <Insecure_Path, Bad>.
[Figure: CFG(X) with nodes X0, X1, X2 and RET, before and after replacing X1]

Second Approach - Algorithm
Algorithm in summary:
• Construct the call graph.
• Get a list of methods in Reverse Topological Order (RTO).
• For each method M in the list, in RTO: compute Summary(M), reusing the method summaries of the methods called at its invocation nodes.
Runtime complexity: O(ΣM), where ΣM is the sum of the sizes of the CFGs of all the methods.

Expanded Second Approach - For Recursion
• Suppose M1 and M2 are mutually recursive methods.
• A cycle exists in the call graph, so analysis based on reverse topological order will not work.
• Solution: repeat the previous algorithm until all method summaries reach a fixed point.
• Worst case run time: O(N * ΣM), which is quadratic. N is the number of methods analyzed.
• Quadratic algorithms are still problematic for a large code base such as the libraries.
How can we make this more efficient?

Our Approach - A New Efficient Solution (Part 1 - Compute Insecure_Path)
Main idea:
• Compute method summaries for all methods at the same time.
• For Insecure_Path summaries, process all the methods at the same time.
• Start with the entry nodes in a queue, and keep adding successors to the queue as you explore.
• Linear run time complexity: O(ΣM), where ΣM is the sum of the sizes of the CFGs of all methods.

    public void X(int x) {
    0:  // do some operations
    1:  Z();
        // sensitive-operation
    2:  FileRd1();
    }

    private void Z() {
    0:  SecurityManager sm = System.getSecurityManager();
    1:  sm.checkPermission(new FilePermission(pathname, "read"));
        // sensitive-operation
    2:  FileRd2();
    }

Step 1: Add X0, Z0 to Q.  (Q: X0 Z0)
Step 2: X0 is a neutral node. Add its successor X1 to Q.  (Q: Z0 X1)
Step 3: Z0 is a neutral node. Add its successor Z1 to Q.  (Q: X1 Z1)
Step 4: X1 is a method invocation node. Add X1 to WQ(Z).  (Q: Z1; WQ(Z): X1)
Step 5: Z1 is a security check. Stop exploring the path.  (Q: empty; WQ(Z): X1)
Step 6: Q is empty. Terminate the algorithm. Both X and Z are All_Path_Secure.

Our Approach - A New Efficient Solution (Part 2 - Compute Bad Summaries)
Main idea (similar to the computation of Insecure_Path):
• Process all the methods at the same time.
• The difference is in how each node in the queue Q is processed:
  - For a sensitive operation, label the method summary as Bad.
  - For a node invoking a method M':
    - If Insecure_Path(M') is true, put all successors into Q.
    - Put the node into the waiting queue of M'.
    - Whenever the Bad summary of M' is computed, process the node based on the summary of M'.
• Linear run time complexity: O(ΣM), where ΣM is the sum of the sizes of the CFGs of all methods.

Using the same methods X and Z:
Step 1: Add X0, Z0 to Q.  (Q: X0 Z0)
Step 2: X0 is a neutral node. Add its successor X1 to Q.  (Q: Z0 X1)
Step 3: Z0 is a neutral node. Add its successor Z1 to Q.  (Q: X1 Z1)
Step 4: X1 is a method invocation node. Insecure_Path(Z) is false. Add method X to WQ(Z).  (Q: Z1; WQ(Z): X)
Step 5: Z1 is a security check. Stop exploring the path.  (Q: empty; WQ(Z): X)
Step 6: Q is empty. Terminate the algorithm. Both X and Z are Good.
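The two queue-based passes above can be sketched as follows. This is an illustrative reading of the slides, not the CMV implementation: the `Node` and `Kind` types, the method names, and the extra caller `W` (the earlier example's X calling Y, renamed to avoid a clash) are all assumptions made for the example. The sketch also assumes every callee has a CFG in the input; a call to an unknown method simply stays parked in its waiting queue.

```java
import java.util.*;

public class SummarySketch {
    public enum Kind { NEUTRAL, CHECK, SENSITIVE, CALL, RETURN }

    /** One CFG node (hypothetical type); callee is non-null only for CALL nodes. */
    public static final class Node {
        final String method, callee;
        final Kind kind;
        final List<Node> succ = new ArrayList<>();
        Node(String method, Kind kind, String callee) {
            this.method = method; this.kind = kind; this.callee = callee;
        }
        /** Adds an edge to next and returns next, so chains read left to right. */
        Node to(Node next) { succ.add(next); return next; }
    }

    /** Enqueues each node at most once. */
    private static void push(List<Node> nodes, Set<Node> seen, Deque<Node> queue) {
        for (Node n : nodes) if (seen.add(n)) queue.add(n);
    }

    /** Part 1: methods with a check-free path from entry to a return node. */
    public static Set<String> insecurePath(Collection<Node> entries) {
        Set<String> insecure = new HashSet<>();
        Map<String, List<Node>> waiting = new HashMap<>(); // WQ(callee): parked call nodes
        Set<Node> seen = new HashSet<>(entries);
        Deque<Node> queue = new ArrayDeque<>(entries);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            if (n.kind == Kind.CHECK) continue;            // guarded from here on
            if (n.kind == Kind.RETURN) {
                if (insecure.add(n.method)) {              // release calls parked on this method
                    for (Node call : waiting.getOrDefault(n.method, List.of())) {
                        push(call.succ, seen, queue);
                    }
                }
            } else if (n.kind == Kind.CALL && !insecure.contains(n.callee)) {
                waiting.computeIfAbsent(n.callee, k -> new ArrayList<>()).add(n);
            } else {
                push(n.succ, seen, queue);                 // neutral, sensitive, or insecure callee
            }
        }
        return insecure;
    }

    /** Part 2: methods with an unguarded path to a sensitive operation. */
    public static Set<String> bad(Collection<Node> entries, Set<String> insecure) {
        Set<String> bad = new HashSet<>();
        Map<String, Set<String>> waiting = new HashMap<>(); // WQ(callee): unguarded callers
        Set<Node> seen = new HashSet<>(entries);
        Deque<Node> queue = new ArrayDeque<>(entries);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            if (n.kind == Kind.SENSITIVE) {
                markBad(n.method, bad, waiting);
            } else if (n.kind == Kind.CALL) {
                if (bad.contains(n.callee)) markBad(n.method, bad, waiting);
                else waiting.computeIfAbsent(n.callee, k -> new HashSet<>()).add(n.method);
                if (insecure.contains(n.callee)) push(n.succ, seen, queue);
            } else if (n.kind == Kind.NEUTRAL) {
                push(n.succ, seen, queue);
            }                                              // CHECK and RETURN end the path
        }
        return bad;
    }

    /** Labels m as Bad and propagates the label to every method parked on it. */
    private static void markBad(String m, Set<String> bad, Map<String, Set<String>> waiting) {
        Deque<String> work = new ArrayDeque<>(List.of(m));
        while (!work.isEmpty()) {
            String cur = work.poll();
            if (bad.add(cur)) work.addAll(waiting.getOrDefault(cur, Set.of()));
        }
    }

    /** Builds two examples: X calls Z (always checked); W calls Y (one unchecked branch). */
    public static List<Node> demoEntries() {
        Node z0 = new Node("Z", Kind.NEUTRAL, null);
        z0.to(new Node("Z", Kind.CHECK, null)).to(new Node("Z", Kind.SENSITIVE, null))
          .to(new Node("Z", Kind.RETURN, null));
        Node x0 = new Node("X", Kind.NEUTRAL, null);
        x0.to(new Node("X", Kind.CALL, "Z")).to(new Node("X", Kind.SENSITIVE, null))
          .to(new Node("X", Kind.RETURN, null));
        Node y0 = new Node("Y", Kind.NEUTRAL, null);
        Node yRet = new Node("Y", Kind.RETURN, null);
        y0.to(new Node("Y", Kind.CHECK, null)).to(new Node("Y", Kind.SENSITIVE, null)).to(yRet);
        y0.to(new Node("Y", Kind.NEUTRAL, null)).to(yRet); // else branch: no check
        Node w0 = new Node("W", Kind.NEUTRAL, null);
        w0.to(new Node("W", Kind.CALL, "Y")).to(new Node("W", Kind.SENSITIVE, null))
          .to(new Node("W", Kind.RETURN, null));
        return List.of(z0, x0, y0, w0);
    }

    public static void main(String[] args) {
        List<Node> entries = demoEntries();
        Set<String> insecure = insecurePath(entries);
        System.out.println("Insecure_Path: " + new TreeSet<>(insecure));     // [W, Y]
        System.out.println("Bad: " + new TreeSet<>(bad(entries, insecure))); // [W]
    }
}
```

Each node enters a queue at most once per pass, which is where the linear bound comes from. In the demo, X and Z come out All_Path_Secure and Good, matching the walkthrough, while W is both Insecure_Path and Bad because Y can return without a check before W's own sensitive operation.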
Our Improvement to the New Approach

    public void Z() {
    0:  // do some operations
        // sensitive-operation
    1:  FileRd1();
    2:  Y();    // Y is a non-public method
    }

[Figure: CFG(Z) with nodes Z0, Z1 (at risk), Z2 (invokes Y) and RET]

Current solution:
• Constructs and explores CFG(Z) and CFG(Y).
• Computes Summary(Y) and Summary(Z).
However, to identify a risky path in Z:
• There is no need to explore CFG(Y) and compute Summary(Y).
• There is no need to compute the Insecure_Path attribute of Z.
Observations:
• Our goal is to identify paths from public methods to sensitive operations without security checks.
• To achieve this goal, sometimes there is no need to compute some method summaries.
How can we make the algorithm even more efficient?

Our Approach: On-the-fly Approach - An Even More Efficient Solution
• Start by processing all public methods first, putting the entry nodes of their CFGs into a queue.
• While exploring the CFGs, if a method invocation node is encountered whose target has never been visited before, add the target's entry node to the queue to explore its CFG. Whenever its Bad attribute or Insecure_Path attribute is computed, replace the node with a non-method-invocation node based on its summary.
• For a return node, label the method as Insecure_Path.
• For a sensitive operation, label the method as Bad.
• For a security check, stop exploring that path.

An Example to Illustrate the On-the-fly Approach
Using the method Z above (Z2 invokes the non-public method Y):
Step 1: Add Z0 to Q.  (Q: Z0)
Step 2: Z0 is a neutral node. Add its successors to Q.  (Q: Z1)
Step 3: Z1 is sensitive. Label method Z as Bad.  (Q: empty)
Step 4: Q is empty. Terminate the algorithm.

Runtime of On-the-fly Approach
In comparison to the non-on-the-fly approach:
• No need to construct the call graph to get all methods.
• Construct the CFG and compute the method summary of each method on demand.
• In the worst case, linear run time complexity: O(ΣM), where ΣM is the sum of the sizes of the CFGs of all methods.

Witness Generation & Analysis
Witness: a call chain that shows a counterexample, i.e. an unguarded path to a sensitive operation, or an insecure path.
Method format: <Declaring class name>: <Return type> <MethodName(param list)>
Legend: bold highlights risky methods; another marker represents other risky methods.
[Figure: witness call chains over the following methods]
• java.net.URL: void <init>(URL, String)
• java.io.ObjectInputStream: java.lang.Object readObject
• java.net.URL: java.net.URLStreamHandler getURLStreamHandler(String)
• java.net.DatagramSocket: void createImpl()
• java.lang.Class: java.lang.Class forName(String)
• java.io.ObjectInputStream: java.lang.Class resolveClass(ObjectStreamClass)
• java.lang.Class: java.lang.Class forName(String, boolean, ClassLoader)
• java.lang.Class: java.lang.Class forName0(String, boolean, ClassLoader)

Experiment & Results
Classes analyzed in both VMs: java.io.*; java.net.*; java.lang.Class.

    VM                               JVM classes   Methods in JVM classes   Concrete   Total methods   LOBC     Risky methods   Real risky
    HotSpot (SUN Microsystems)       22            775                      703        1520            23,394   61              0
    Harmony (Apache Software Fdn.)   21            748                      689        3928            65,362   0               0

• Risky method: a public and Bad method.
• Real risky: a risky method that has at least one feasible bad path in practice.

Experiment & Results (Contd.)
• The average time taken by CMV to analyze each class was 74 seconds.
• The bulk of the time is spent on call graph construction.
• A preloaded method summary database stores summaries.
• Risky methods need further analysis.
• In the HotSpot VM, only 61 methods need to be analyzed, versus code review of 1520 methods: roughly a 25-fold reduction.
• Automated witness analysis further reduces human review effort.

Future Work #1 - Method Overriding
• A method in a base class can be overridden by child classes.
• For a method invocation node, method summaries from the base class and all child classes should be considered.
Our solution:
• Transform the code of the base class by adding an n-way branch statement before the first statement, making a call to each overriding method of the child classes.
• The method summary in each branch is computed.

Future Work #2 - Exceptions
Code can throw exceptions, which changes the control flow.
Issues:
• Throw statement
  - If a catch block exists, add an edge from the throw to the catch block.
  - Add an edge from the throw to the return node.
• Method invocation node
  - Handled similarly to a throw statement.
• Rethrow statement, i.e. a throw in a catch block
  - Replace the node with a return node.
Solution: add new edges.

Future Work #3 - Check Non-native Sensitive Operations
Some sensitive operations are not implemented as native methods. For example:
• Setting a sensitive private data member
• Returning a sensitive private data member
• Returning a password
• Returning a network IP address
Solution: identify these sensitive operations, and then apply the same approach used for sensitive native methods to compute method summaries.

Future Work #4 - Automated Sensitive Operation Detection
Automate the detection of sensitive operations in the Java libraries.
Challenges: they are hard to identify because:
• Some native methods are non-sensitive; they do not access sensitive system resources.
• The parameters passed to an operation are another factor (in addition to the method name) in determining whether it is a sensitive operation and what its sensitive type is (e.g. reading or writing a file).

Future Work #5 - Automated Permission Checks Detection
Determine the types of permission checks for each sensitive operation.
Challenges:
• Need to first identify the sensitive operations in the JVM.
• Should also determine the permission checks for those sensitive operations that are performed in the Java libraries without permission checks because of an undetected defect.

Future Work #6 - Certification
Given a method, say M0, that is claimed to be Good or All_Path_Secure, verification is needed to ascertain the correctness of the claim.
Our solution:
• We have initiated a novel and scalable verification technique to certify the correctness of method summaries.
• Experiments are ongoing, and we will report our work in the future.

Related Work
• Model checking
  - Several general-purpose tools: MOPS, SLAM, Bandera, C-Wolf.
  - Ours is specialized for this problem; property-specific customization makes it scalable.
  - Jensen et al. [IEEE S&P99]: algorithms for verification of closed systems. Our approach checks open systems (e.g., libraries).
• Static analysis
  - Bug finding [METAL, CQUAL].
  - Checking complete mediation [Zhang et al. [Security'03], Fraser [PLAS06]].
  - These techniques require non-trivial transformations to apply to Java-like libraries, e.g. when security checks and sensitive operations are in different methods.
• Retrofitting code for authorization
  - Naccio, SASI, Ganapathy et al. [IEEE S&P06].
  - One requires retrofitting only when verification reports unsafe methods.

Tool & Paper
• Tool: Complete Mediation Verifier (CMV). Developed; to be released by Dec. 2008.
• Paper: CMV: Automatic Verification of Complete Mediation for Java Virtual Machines. Accepted by the ACM Symposium on Information, Computer and Communications Security (ASIACCS'08), to be held in Tokyo, Japan, March 18-20, 2008.
• Collaborators: Hilary Branske, Prof. Prasad Sistla, Prof. V.N. Venkatakrishnan.

Thank you!