HLDVT 2009 Dynamic Verification of Multicore Communication Applications in MCAPI Subodh Sharma, UofU Ganesh Gopalakrishnan, UofU Eric Mercer, BYU Supported by NSF CCF 0903408/0903491 and SRC 2009-TJ-1993/1994 1 Concurrency Space in Multicore Era S/W solution H/W solution Cloud/HPC Computing MPI, PVM, Mapreduce, MSCS Clusters, Vector machines, Supercomputers Desktop computing OOSD, Scripting, Pthreads, TBB, CT, OpenMP Desktop machines Embedded Computing Lightweight counterparts of the above. FPGAs, DoC, SoC, etc. 2 Concurrency Space in Multicore Era S/W solution H/W solution Cloud/HPC Computing MPI, PVM, Mapreduce, MSCS Clusters, Vector machines, Supercomputers Server/Desktop computing OOSD, Scripting, Pthreads, TBB, CT, OpenMP Desktop machines Embedded Computing Lightweight counterparts of the above. FPGAs, DoC, SoC, etc. Requires Standardization Multicore Association (MCA) Effort 3 What is MCAPI (Multicore Communication API)? • An API specification from MCA (Multicore Association) • To program embedded systems like mobile phones, PDAs, routers, servers, etc. • Not restricted to SPMD (like MPI) or multi threaded style of programming. 4 Taxonomy for MCA APIs MCAPI – Communication API • Message based • Packet/Scalar Channel based Inter API interactions MTAPI – Task Management API • Specification work yet to begin • Thread Pooling, Work Stealing queues e.g. CILK, TBB, TPL etc. MRAPI – Resource Management API • Semaphores, mutexes • Shared memory segment allocation, deallocation 5 State of MCA Efforts? • MCAPI standard is released. • MCAPI reference implementation exists • A commercial implementation – Polycore, Inc. • In-house implementation on FPGA in progress (Utah). • MRAPI standard work in progress. • MTAPI efforts yet to begin. 6 State of MCA Efforts? • Standardization efforts take a long time to mature • Need for FV in MCA API development. • Our effort: Early Engagement of FV in the MCA APIs space • To promote early adoption of the API • To promote better programming practices • To help avoid early pit-falls – e.g. unspecified behaviors 7 Achievements so far! • Formal Specification work (at BYU) • Permits symbolic analysis • Murphi -> CVC • Query Oracle for scenario outcome calculation. • Platform testing • Dynamic Verification of MCAPI user apps (at Utah) • MCC tool created. • Current capabilities of MCC • Dynamic search of dependencies • Exerting a deterministic runtime control • Algorithmic Advances • Symmetry reduction in MCAPI domain space. 8 Focus of the talk! • Formal Specification work (at BYU) • Permits execution based symbolic analysis • Murphi -> CVC • Query Oracle for scenario outcome calculation. • Platform testing • Dynamic Verification of MCAPI user apps (at Utah) • MCC tool created. • Current capabilities of MCC • Dynamic search of dependencies • Exerting a deterministic runtime control • Algorithmic Advances • Symmetry reduction in MCAPI domain space. 9 Related Work • CHESS tool from MSR • Uses preemption bounded search • Cannot be used for message passing programs • ISP tool from Utah • Used for MPI verification • MCC differs in the way determinism is enforced at runtime. • Inspect tool from Utah • Used for shared memory programs • Implements the classical DPOR algorithm. • JPF • A model checker for Java bytecode. 10 MCAPI Preliminaries • MCAPI Node – • Processes, Threads, OS instance, Processor core, etc. • MCAPI Endpoint – • Socket-like communication termination points • Identified by a tuple <node_id, port_id> • MCAPI Channels • Point to point unidirectional FIFO connections between a pair of endpoints. • A single communication universe 11 MCAPI Messages • Message received at FIFO receive queue Node 1 Connectionless Message Port 1 <1,1> To <2,1> Node 2 Port 1 <2,1> Port 2 <1,2> 12 MCAPI Connection Oriented Communication • Sent over a connected channel • <1,1> can communicate on ONE channel only Node 1 Node 2 Port 1 <1,1> Port 2 <1,2> Port 1 <2,1> To <2,1> ERROR!! To <3,1> Node 3 Port 1 <3,1> 13 MCAPI API generic + message calls Initialize/Finalize Sets and deletes the environment. Must be called by each communicating node. Create/Delete endpoint API calls to manage creation/deletion of endpoints on the owner node Get_endpoint A blocking call to get the address of a remote endpoint. Send (send_endpoint, recv_endpoint .. ) Sends a data-payload from a sending endpoint to a receiving endpoint. API provides blocking and non blocking versions. Recv (recv_endpoint, ..) Receives a data payload from the receiving endpoint. API provides blocking non-blocking versions. Wait (request_handle, ..), Test(req_handle, ..) Checks the completion of the request. Wait – Blocking; Test – Non blocking 14 Illustration of MCAPI C program int main () { … for(t=0; t<NUM_THREADS; t++){ rc = pthread_create(&threads[t], NULL, run_thread, (void*)&thread_data_array[t]); } for (t = 0; t < NUM_THREADS; t++) { pthread_join(threads[t],NULL); } } void* run_thread (void *t) { … mcapi_msg_recv(recv_endpt,msg, BUFF_SIZE,&recv_size, &status); mcapi_msg_recv(recv_endpt,msg, BUFF_SIZE, &recv_size, &status); } else { Potential send_endpt = mcapi_create_endpoint deadlock (PORT_NUM,&status); recv_endpt = mcapi_get_endpoint (1,PORT_NUM,&status); mcapi_msg_send(send_endpt,recv_endpt, msg,strlen(msg), mcapi_initialize(tid,&version,&status 1,&status); ); } if (tid == 1) { mcapi_finalize(&status); recv_endpt = return NULL; mcapi_create_endpoint (PORT_NUM, } &status); 15 MCC – MCAPI Checker MCAPI C Program instrumentation Instrumented Program compile Executable thread 1 request/permit Scheduler thread n MCAPI Library Wrapper Workflow of MCC. 16 Bug Classes in MCAPI User Applications • Non determinism due to communication race. • The recv() call is passed with only receiving endpoint as a parameter. • The receiving endpoint extracts message from the FIFO receive queue. T0 T1 T2 --- --- --- S(e1,e2) S(e1,e2) R(e2) The send that matches the recv may decide the correctness !! R(e2) 17 Bug Classes in MCAPI User Applications • Mismatched Calls T0 T1 --- --- R(1) Does not exist Get_endpoint(2) S(1,2) “get_endpoint” invoked for a non-existent endpoint -- ERROR 18 Challenge: exponential interleavings T0 T1 T2 A T3 T4 T5 Possible Interleaving B Only these 2 are RELEVANT!!! Dependent MCAPI calls 19 Relevant interleaving exploration by MCC scheduler T0 S (ep1,ep3) T1 S(ep2,ep3) T2 R(ep3) Intra happens-Before Edge R(ep3) 20 Relevant interleaving exploration by MCC scheduler Interleaving 1 T0 T1 S (ep1,ep3) S (ep2,ep3) Issue to the runtime Set of matching transitions T2 R (ep3) R (ep3) No runnable thread Match sets are computed at “decision points” Enabled Transitions: S(ep1, ep3), S(ep2, ep3), R (ep3) Match Set: <S(ep1, ep3), R (ep3)> <S(ep2, ep3), R (ep3)> 21 Relevant interleaving exploration by MCC scheduler Interleaving 1 T0 S (ep1,ep3) T1 S (ep2,ep3) T2 R (ep3) R (ep3) Enabled Transitions: S(ep2, ep3), R (ep3) Match Set: <S(ep2, ep3), R (ep3)> 22 Relevant interleaving exploration by MCC scheduler Interleaving 2 T0 S (ep1,ep3) T1 S (ep2,ep3) Issue to the runtime T2 R (ep3) R (ep3) Enabled Transitions: S(ep1, ep3), S(ep2, ep3), R (ep3) Match Set: <S(ep2, ep3), R (ep3)> 23 Relevant interleaving exploration by MCC scheduler Interleaving 2 T0 S (ep1,ep3) T1 S (ep2,ep3) Issue to the runtime T2 R (ep3) R (ep3) Enabled Transitions: S(ep2, ep3), R (ep3) Match Set: <S(ep1, ep3), R (ep3)> 24 Relevant interleaving exploration by MCC scheduler Interleaving 2 T0 S (ep1,ep3) T1 S (ep2,ep3) Issue to the runtime T2 R (ep3) R (ep3) Enabled Transitions: S(ep2, ep3), R (ep3) Match Set: <S(ep1, ep3), R (ep3)> 25 Enforcing a deterministic runtime match • For non blocking requests, there is an additional problem • Runtime communication race even after a scheduler decides deterministically. T0 IS (ep1,ep3) T1 IS (ep2,ep3) Match-set 2 T2 Match-set 1 IR (ep3) IR (ep3) 26 Enforcing a deterministic runtime match T0 T1 RACE IS (ep1,ep3) IS (ep2,ep3) T2 Match-set 1 IR (ep3) Not finished yet!! Match-set 2 IR (ep3) 27 Enforcing a deterministic runtime match Solution 1 to enforce a deterministic match. • Addition of Probing function to the MCAPI library • Probes an endpoint’s receive queue for a message • Returns TRUE if the message is found • Scheduler issues the next match-set only when the probe function returns TRUE. 28 Enforcing a deterministic runtime match Probing Solution T0 T1 T2 Receive Queue IS (ep1,ep3) IS (ep2,ep3) matched IR (ep3) IR (ep3) 29 Enforcing a deterministic runtime match Probing Solution T0 T1 T2 Receive Queue IS (ep1,ep3) IS (ep2,ep3) matched IR (ep3) IR (ep3) Probes Runtime MCC Scheduler 30 Enforcing a deterministic runtime match Probing Solution T0 T1 T2 T1 IS (ep1,ep3) IS (ep2,ep3) Receive Queue IR (ep3) IR (ep3) Probes Runtime MCC Scheduler 31 Enforcing a deterministic runtime match Probing Solution T0 T1 T2 T0 T1 IS (ep1,ep3) IS (ep2,ep3) Receive Queue IR (ep3) IR (ep3) 32 Enforcing a deterministic runtime match Solution 2 to enforce a deterministic match. • Scheduler induces a “test” call • Scheduler spin-loops on the request handle until the successful completion of the “test” call. We opt Solution 2 in our work as it is non-intrusive. 33 Enforcing a deterministic runtime match Dummy “wait” Solution T0 IS (ep1,ep3) T1 IS (ep2,ep3) matched T2 IR (ep3) IR (ep3) Enabled Transitions: Send(ep1, ep3), Send(ep2, ep3), Recv (ep3) Match Set: <Send(ep2, ep3), Recv (ep3)> <Send(ep1, ep3), Recv (ep3)> 34 Enforcing a deterministic runtime match Dummy “wait” Solution T0 T1 IS (ep1,ep3) IS (ep2,ep3) T2 IR (ep3) Test () IR (ep3) Runtime MCC Scheduler 35 Enforcing a deterministic runtime match Dummy “wait” Solution T0 T1 IS (ep1,ep3) IS (ep2,ep3) T2 IR (ep3) Test() matched Runtime IR (ep3) MCC Scheduler 36 Concluding remarks and future work • MCC currently supports connectionless API constructs • Checks for safety assertion violations and Deadlocks • Steadily improve MCC over this year • Support for connection-oriented API calls, sanity checks etc. • Release MCC and assess experience of industrial app writers 37 www.cs.utah.edu/formal_verification 38 Happens-Before Relationship – Computed Dynamically T0 IS (ep1,ep3, r1) IS (ep1,ep4, r2) W(r2) T1 IS (ep2,ep3, r3) T2 IR (ep3, r4) IR (ep3, r5) R (ep3) 39 Review of POR Dynamic POR Dependencies based on concrete values t1: a[x] := … t2: … := a[y] t2: … := a[y] t1: a[x] := … x=56, y=753 (e.g.) at runtime! Equal ?? 40 Happens-Before Relationship – Example T0 IS (ep1,ep3, r1) IS (ep1,ep4, r2) W(r2) T1 IS (ep2,ep3, r3) T2 IR (ep3, r4) IR (ep4, r5) R (ep3) 41 OOO completion: Why construct HappensBefore relationship? T0 IS (1GB to T1) IS (1 byte to T2) Must calls complete in PROGRAM ORDER? NO!!! 42