Specifying Multithreaded Java semantics for Program Verification Abhik Roychoudhury National University of Singapore (Joint work with Tulika Mitra) Java Multithreading Threads communicate via shared variables Threads run on uni- or multi-processors Semantics given as abstract rules : Java Memory Model (JMM) Supports locking of shared variables via synchronized statements Locking may be avoided …. ICSE 2002, Orlando FL 2 Shared variable access without locks Initially : A = 0, B = 0 B = 1; while (A != 1) {}; A = 1; return B || Expected returned value = 1 Locks not used, but using A as flag. ICSE 2002, Orlando FL 3 What may happen B = 1; while (A != 1) {}; A = 1; return B || A= 1 May happen if ….. return B Threads are executed on different processors. B = 1 Returned value = 0 ICSE 2002, Orlando FL 4 Sequential Consistency Programmer expects statements within a thread to complete in program order: Sequential Consistency I. Each thread proceeds in program order II. Operations across threads are interleaved Op1 Op’’ Op2 Op’ violates SC Op’ ; Op1; Op2; || ICSE 2002, Orlando FL Op’’ ; 5 Is this a problem ? Programmers expect SC Verification techniques assume SC execution SC not guaranteed by execution platforms Not demanded by Java language spec. Unrealistic for any future spec. to demand YES !! ICSE 2002, Orlando FL 6 Organization Shared variable access without locks Candidate solutions Specifying the Java Memory Model (JMM) Using JMM for verification ICSE 2002, Orlando FL 7 1. Program with caution Always synchronize (and acquire lock) before writing a shared object For these programs, any execution is SC Unacceptable performance overheads for low-level libraries Software libraries from other sources cannot be guaranteed to be properly synchronized ICSE 2002, Orlando FL 8 2. Change the Semantics Current semantics called Java Memory Model (JMM). Part of Java language spec. Weaker than Sequential Consistency. Specifies which re-orderings are allowed within a thread. Currently being considered for revision Existing/future JMM bound to be weaker than SC – Does not solve the problem ICSE 2002, Orlando FL 9 3. Use the semantics Develop a formal executable description of the Java Memory Model Use it for static checking of programs JMM captures all possible behaviors for any implementation – Platform independent reasoning Adds value to existing program verification techniques ICSE 2002, Orlando FL 10 Organization Shared variable access without locks Candidate solutions Specifying the Java Memory Model (JMM) Using JMM for verification ICSE 2002, Orlando FL 11 JMM Overview Each shared variable v has • A master copy • A local copy in each thread Threads read and write local/master copies by actions Imposes ordering constraints on actions Constraints are informal and declarative Hard to understand !! ICSE 2002, Orlando FL 12 JMM Actions read(v,t) < load(v,t) Master copy of v Local copy of v in t write(v,t)> store(v,t) Actions invoked by Program Execution: use/assign(t,v) Read from/ Write into local copy of v in t lock/unlock(t) Acquire/Release lock on shared variables ICSE 2002, Orlando FL 13 Formal Specification Asynchronous concurrent composition • Th1 || Th2 || … || Thn || MM Local state of each thread modeled as cache • ( Local copy, Stale bit, Dirty bit ) • Queues for incomplete reads/writes Local state of MM • ( Master copies, Lock ownership info ) ICSE 2002, Orlando FL 14 Specifying JMM Actions Each action is a guarded command G B Only 8 actions capture all the rules. Example: Use of variable v by thread t • stale[t,v] return local_copy[t,v] Applicability of action stated as guard In rule-based description, several rules determine the applicability ICSE 2002, Orlando FL 15 Understanding JMM assign(t,v) assign(t, v) < load(t,v) < store(t,v) < load(t,v) A store must intervene between an assign and a load action for a variable v by thread t ICSE 2002, Orlando FL 16 Understanding JMM assign(t, v) assign(t,v) < assign(t,v) < load(t,v) store(t,v) < load(t,v) < store(t,v) < write(t,v) < read(t,v) < load(t,v) read < assign <load is not possible. We specify assign(t,v) : empty(read_queue[t,v]) -> …. ICSE 2002, Orlando FL 17 Executable model Java threads invoke use,assign,lock,unlock Threads block if the next action not enabled To enable these, store,write,load,read are executed in any order provided guard holds Example: To enable assign, a load is executed (if there was an earlier read) Proof of equivalence between executable and rule-based JMM. ICSE 2002, Orlando FL 18 Other JMM Features Executable model also captures: • Volatile variables : Every access of these variables accesses master copy • Prescient stores: Writing master copy ahead of local copy • Waiting and Notification Modeled by additional guarded commands. ICSE 2002, Orlando FL 19 Organization Shared variable access without locks Candidate solutions Specifying the Java Memory Model (JMM) Using JMM for verification ICSE 2002, Orlando FL 20 Verifying Unsynchronized Code assign(B,1) assign(A,1) While (use(A) !=1){} || use(B) use/assign invoke corresponding guarded commands load/store/read/write executed to enable use/assign Exhaustive state space exploration shows use(B) may return 1 or 0 ICSE 2002, Orlando FL 21 Program verification Composing executable JMM allows search The state space explosion problem Most Java programs are “properly synchronized” and hence SC execution Unsynchronized code appears in low-level fragments which are frequently executed These programs are small, so … ICSE 2002, Orlando FL 22 One possibility User chooses one program path in each thread (Creative step) Exhaustively check all possible execution traces from chosen program paths (Automated state space exploration: Can verify invariants) Choosing program paths requires understanding source code, not JMM ICSE 2002, Orlando FL 23 Case Study: Double Checked Locking Idiom for lazy instantiation of a singleton Check whether garbage data-fields can be returned by this object initialization routine Verification by composing the JMM reveals: • Incompletely instantiated object can be returned • Due to re-ordering of writes within constructor • Detected by prototype invariant checker in 0.15 sec ICSE 2002, Orlando FL 24 Summary Executable specification of Multithreaded Java semantics Using the specification for checking multithreaded programs Similar approach has been studied before in the context of hardware multiprocessors How to correct the bugs found (the harmful re-orderings) ? ICSE 2002, Orlando FL 25