Civilian Worms: Ensuring Reliability in an Unreliable Environment
Sanjeev R. Kulkarni, University of Wisconsin-Madison, sanjeevk@cs.wisc.edu
Joint work with Sambavi Muthukrishnan

Outline
- Motivation and Goals
- Civilian Worms
- Master-Worker Model
- Leader Election
- Forward Progress
- Correctness
- Parallel Applications

What's happening today
- Move towards clusters
- Resource managers, e.g. Condor
- Dynamic environment

Motivation
- Large parallel/standalone applications
- Non-dedicated resources
- Unreliable commodity clusters, e.g. a Condor environment
- Machines can disappear at any time: hardware failures, network failures, security attacks!

What's available
- Parallel platforms: MPI, PVM
- MPI-1: machines can't go away!
- MPI-2: any takers? Shoot the master!
- Condor: shoot the Central Manager!

Goal
- Bottleneck-free infrastructure in an unreliable environment
- Ensure "normal termination" of applications
- Users submit their jobs and get e-mail upon completion!

Focus of this talk
- Approaches for reliability
- Standalone applications: a monitor framework (worms!), replication
- Parallel applications: future work!

Worms are here again!
- Usual worms: self-replicating, hard to detect and kill
- Civilian worms: controlled replication, spread legally, monitor applications

Desired Monitoring System
(diagram: each computation C is monitored by a worm W)

Issues
- Management of worms
- Forward progress
- Distributed state detection: very hard
- Checkpointing
- Correctness

Management Models
- Master-Worker: simple, effective, our choice!
- Symmetric: difficult to manage the model itself!

Our Implementation Model
(diagram: a master worm plus worker worms W, each worker paired with a computation C)

Worm States
- Master: maintains the state of all the worm segments, listens on a particular socket, respawns failed worm segments
- Worker: periodically pings the master, starts the encapsulated process if instructed
- Leader Election: invokes the LE algorithm to elect a new master
- Note: independent of application state

Leader Election: the woes begin!
- Master goes down
- Detection: a worker's ping times out (timeout value), or a worker gets an LE message
- Action: the worker goes into the LE state

LE algorithm
- Each worm segment is given an ID (only the master gives out ids)
- Workers broadcast their ids
- The worker with the lowest id wins

Brief Skeleton (a code sketch of this rule follows the convergence results below)
- While in LE: broadcast an LE message with your id; set min = your id
- On getting an LE message with id i: if i >= min, ignore it; else min = i
- min is the new master

LE in action (1): the master M0 goes down, leaving workers W1 and W2
LE in action (2): L1 and L2 send out LE messages (LE,1 and LE,2)
LE in action (3): L1 gets LE,2 and ignores it; L2 gets LE,1 and sends COORD_ACK
LE in action (4): M1 sends COORD to W2 and spawns W0

Implementation Problems
- Too many cases, many of them unclear
- Time to converge
- Timeout values
- Network partition

What happens if?
- The master is still up? If the incoming id < its own id, it goes into LE mode; else it sends back a COORD message
- The next master in line goes down? Timeout on COORD message receipt
- A late COORD_ACK? Send a KILL message

More Bizarre Cases
- Multiple masters? The master broadcasts its id periodically; the conflict is resolved by the lowest-id method
- No master? The workers will time out soon!

Test-Bed
- 64 dual-processor 550 MHz P-III nodes
- Linux 2.2.12, 2 GB RAM
- Fast interconnect: 100 Mbps
- Master-worker communication via UDP

A Stress Test for LE
- Workers ping every second
- Kill n/4 workers
- After 1 sec, kill the master
- After 0.5 sec, kill the master in line
- Kill n/4 workers again
- Measure convergence

Convergence
(graph: convergence time in seconds, 0 to 35, vs. cluster sizes of 2, 4, 8, and 16 nodes)
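The sketch below illustrates the "lowest id wins" rule from the Brief Skeleton slide. It is a minimal illustration, not the actual implementation: the names (enter_le_state, handle_le_message, i_am_new_master, worm_id, le_min, broadcast_le) are invented, and the UDP broadcast transport, timeouts, and COORD/COORD_ACK exchange are omitted.

```c
/* Minimal sketch of the leader-election decision rule (invented names;
 * transport and timeouts omitted). Each segment broadcasts its id, keeps
 * the smallest id it has seen, and the segment whose own id equals that
 * minimum becomes the new master. */
#include <stdio.h>

static int worm_id;  /* id assigned to this segment by the previous master */
static int le_min;   /* lowest id seen so far during this election         */

/* Entering the LE state: remember our own id and broadcast it. */
void enter_le_state(int my_id) {
    worm_id = my_id;
    le_min  = my_id;
    /* broadcast_le(worm_id);  -- UDP broadcast, not shown */
}

/* Handle one incoming LE message while in the LE state. */
void handle_le_message(int incoming_id) {
    if (incoming_id >= le_min)
        return;               /* sender cannot win; ignore */
    le_min = incoming_id;     /* better candidate seen     */
}

/* After the election timeout: the segment holding the minimum id sends
 * COORD to the others and respawns the failed segment. */
int i_am_new_master(void) {
    return worm_id == le_min;
}

int main(void) {
    enter_le_state(2);        /* e.g. the segment with id 2 enters LE */
    handle_le_message(1);     /* segment 1 is also electing           */
    printf("new master? %s\n", i_am_new_master() ? "yes" : "no");
    return 0;
}
```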
Forward Progress
- Why? MTTF < application time
- Solutions: checkpointing at the application level or the process level; restart from the checkpoint image!

Checkpoint
- Address space: the Condor checkpoint library rewrites object files and writes a checkpoint to a file on SIGUSR2
- Files: assumes a common file system

Correctness: File Access
- Read-only: no problems
- Writes: possible inconsistency if multiple processes access the file; inconsistency across checkpoints?
- Need a new file access algorithm
- Solution: individual versions

File Access Algorithm
- On open, if this is the first open: for a read, do nothing; for a write, create a local copy and set up a mapping
- On later opens: if a mapping exists, access the mapped file; if opening for write, create a local copy and set up a mapping
- On close: preserve the mapping

File Access (cont.)
- Commit point: on completion of the computation
- Checkpoint: includes the mapped files

Being More Fancy
- Security attacks: the civilian-to-military transition
- Hide yourself from ps
- Re-fork periodically to avoid detection

Conclusion
- LE is VERY HARD: don't take it on as a course project!
- Does our system work? 16 nodes: yes; 32 nodes: no
- Quite reliable

Future Direction
- Robustness
- Extension to parallel programs: rewrite send/recv calls; routing issues
- Scalability issues? A hierarchical design?

References
- F. B. Cohen, "A Case for Benevolent Viruses", http://www.all.net/books/integ/goodvcase.html
- M. Litzkow and M. Solomon, "Supporting Checkpointing and Process Migration outside the UNIX Kernel", Usenix Conference Proceedings, San Francisco, CA, January 1992.
- Gurdip Singh, "Leader Election in Complete Networks", PODC '92.

Implementation Architecture
(diagram: the worm wraps the computation with a Communicator, Dispatcher, Dequeuer, and Checkpointer; queue operations include Remove, Checkpoint, Prepend, and Append)

Parallel Programs
- Communication: connectivity across failures
- Rewrite send/recv socket calls
- Limitations of the Master-Worker model? Not really!

Communication
- Checkpoint markers
- Buffer all data between checkpoint markers (a sketch follows below)
- Help from the master in rerouting
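The sketch below illustrates the checkpoint-marker buffering idea from the Communication slide. It is an assumed interpretation, not the talk's implementation: the names (worm_send, on_checkpoint_marker, replay_for, real_send, forward_via_master) are invented, and the rewritten socket calls and master-assisted rerouting are only indicated in comments.

```c
/* Minimal sketch of buffering data between checkpoint markers (invented
 * names): every message sent since the last marker is logged so it can be
 * replayed, with the master's help, if the receiver rolls back to its
 * last checkpoint. */
#include <stdlib.h>
#include <string.h>

#define MAX_PENDING 1024

struct pending { int dest; size_t len; char *data; };

static struct pending log_buf[MAX_PENDING];
static int n_pending = 0;

/* Wrapped send: copy the payload into the in-memory log before sending. */
void worm_send(int dest, const void *data, size_t len) {
    if (n_pending >= MAX_PENDING)
        return;                       /* log full; real code would flush */
    struct pending *p = &log_buf[n_pending++];
    p->dest = dest;
    p->len  = len;
    p->data = malloc(len);
    memcpy(p->data, data, len);
    /* real_send(dest, data, len);  -- rewritten socket call, not shown */
}

/* A checkpoint marker covers all earlier messages, so the log is cleared. */
void on_checkpoint_marker(void) {
    for (int i = 0; i < n_pending; i++)
        free(log_buf[i].data);
    n_pending = 0;
}

/* On a peer failure, replay the logged messages for that peer through the
 * master, which reroutes them to the respawned segment. */
void replay_for(int dest) {
    for (int i = 0; i < n_pending; i++) {
        if (log_buf[i].dest == dest) {
            /* forward_via_master(dest, log_buf[i].data, log_buf[i].len); */
        }
    }
}

int main(void) {
    char msg[] = "partial result";
    worm_send(3, msg, sizeof msg);    /* send to a hypothetical segment 3 */
    on_checkpoint_marker();           /* marker reached: log discarded    */
    return 0;
}
```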