Bringing Paxos Consensus in Multi-agent Systems
Andrei Mocanu, Costin Bădică
University of Craiova

What is consensus?
  Agreement
    No two processes decide differently
  Termination
    Every correct process eventually decides
  Validity
    The decided value must be among the values proposed by the processes

Why is consensus important?
  Ensuring that processes agree is a fundamental problem in Distributed Computing
  Applications in other problems:
    leader election
    state machine replication
    atomic broadcast

Why is consensus hard?
  Fischer-Lynch-Paterson Result
    It is impossible to solve consensus deterministically in an asynchronous system in the presence of even a single failure.
    No deterministic fault-tolerant consensus protocol can guarantee progress in an asynchronous network!

Paxos Algorithm
  Developed in the 1980s by Leslie Lamport, but the paper was rejected
  Eventually published in 1998 and simplified in 2001
  Used in practice:
    Google Chubby and Megastore
    Microsoft Autopilot
    Apache Hadoop ZooKeeper
  Safety Properties
    the value chosen is one of the proposed values
    only one value is chosen
    a value is learned only if it has been chosen
  Paxos will eventually succeed if
    a majority of participants is reachable
    processes know how to generate values

Paxos Roles
  Client
    the process that makes the request
  Acceptor
    represents the fault-tolerant "memory"
    organized in groups called Quorums
    any message sent to an Acceptor must be sent to a Quorum of Acceptors
    any message received from an Acceptor is ignored unless a copy is received from each Acceptor in a Quorum
  Proposer
    acts on behalf of the Client
    tries to assemble a majority of Acceptors
  Leader
    a distinguished Proposer
    many processes may believe themselves to be Leaders, but progress is guaranteed only once a single one of them is chosen
  Learner
    provides replication
    once the decision has been received from the Acceptors, Learners take action and send the response to the Client
    more Learners may be added to increase availability

Paxos Phases
  Phase 1a: Prepare
    a Proposer (the Leader) creates a proposal numbered n,
    greater than any proposal number used by that Proposer in the past
    the Proposer sends a Prepare message containing n to a Quorum of Acceptors
  Phase 1b: Promise
    if n is higher than every proposal number the Acceptor has received so far, the Acceptor promises to ignore future proposals numbered less than n
    if the Acceptor has already accepted a proposal m < n, its Promise includes the proposal number m and the accepted value u
    otherwise the Acceptor ignores the Prepare message (or sends a NACK)
  Phase 2a: Accept Request
    if the Proposer receives Promises from a Quorum of Acceptors, it chooses the value of the highest-numbered proposal reported by those Acceptors, or a value it generates itself if none was reported
    the Proposer sends that value to the Quorum in an Accept Request message
  Phase 2b: Accepted
    if an Acceptor receives an Accept Request message for proposal n, it accepts the proposal unless it has already promised to consider only proposals numbered greater than n; it registers the value of the proposal and sends an Accepted message to the Proposer and the Learners
    otherwise it ignores the Accept Request message (or sends a NACK)

Paxos Example

Paxos Role Distribution

Collapsing Paxos Roles

Discovery - JADE Yellow Pages

Class Structure & Dependencies

Dueling Proposers Scenario

Dueling Proposers – Experiment
  Paxos Algorithm (3 Proposers, 5 Acceptors, 5 Learners)
  [Chart: Rounds until Consensus (y-axis) against Run Number (x-axis), one series each for 10%, 30% and 50% loss]

Dueling Proposers – Experiment
  Paxos Algorithm (10 Proposers, 5 Acceptors, 5 Learners)
  [Chart: Rounds until Consensus (y-axis) against Run Number (x-axis), one series each for 10%, 30% and 50% loss]

Dueling Proposers – Experiment
  Paxos Algorithm (3 Proposers, 15 Acceptors, 50 Learners)
  [Chart: Rounds until Consensus (y-axis) against Run Number (x-axis), one series each for 10%, 30% and 50% loss]

Experimental Results Summary
  Configuration / Message Loss               10%     30%     50%
  3 Proposers, 5 Acceptors, 5 Learners       1.42    2.18    3.83
  10 Proposers, 5 Acceptors, 5 Learners      1.47    2.48    6.36
  3 Proposers, 15 Acceptors, 50 Learners     1.63    2.2     4.78

Conclusions
  We demonstrate an implementation of Paxos using Multi-agent Systems
  We leverage existing agents in the system to create a fault-tolerant layer
  We provide an experimental analysis of an interesting Paxos edge case in JADE

Questions?
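
The Prepare, Promise, Accept Request and Accepted phases described in this talk can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the JADE implementation presented here: the names `Acceptor`, `on_prepare`, `on_accept_request` and `propose` are hypothetical, messages are modelled as direct method calls, every Acceptor is contacted in each round, and message loss, NACK handling and Learners are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    """Hypothetical in-memory Acceptor for a single Paxos decision."""
    promised_n: int = -1                        # highest proposal number promised so far
    accepted: Optional[Tuple[int, str]] = None  # (number, value) of the last accepted proposal

    def on_prepare(self, n: int):
        # Phase 1b: promise only if n is higher than any number seen so far,
        # and report the previously accepted proposal (if any).
        if n > self.promised_n:
            self.promised_n = n
            return ("promise", self.accepted)
        return ("nack", None)

    def on_accept_request(self, n: int, value: str) -> bool:
        # Phase 2b: accept unless a higher-numbered proposal has been promised.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, own_value: str) -> Optional[str]:
    """Run one round as a Proposer; return the chosen value, or None if the round fails."""
    majority = len(acceptors) // 2 + 1

    # Phase 1a: send Prepare(n) and collect the Promises.
    replies = [a.on_prepare(n) for a in acceptors]
    promises = [prev for kind, prev in replies if kind == "promise"]
    if len(promises) < majority:
        return None

    # Phase 2a: adopt the value of the highest-numbered accepted proposal
    # reported in the Promises, or fall back to the Proposer's own value.
    reported = [p for p in promises if p is not None]
    value = max(reported, key=lambda p: p[0])[1] if reported else own_value

    # Phase 2a/2b: send Accept Request(n, value) and count Accepted replies.
    accepted_count = sum(a.on_accept_request(n, value) for a in acceptors)
    return value if accepted_count >= majority else None


acceptors = [Acceptor() for _ in range(5)]
first = propose(acceptors, n=1, own_value="A")   # nothing accepted yet, so "A" is chosen
second = propose(acceptors, n=2, own_value="B")  # must adopt "A", which is already chosen
stale = propose(acceptors, n=1, own_value="C")   # outdated proposal number, round fails
print(first, second, stale)
```

In this sketch the second Proposer ends up re-proposing the value that was already chosen, which is how the protocol keeps the "only one value is chosen" property, and the third round fails because its proposal number is lower than what the Acceptors have promised. A dueling-proposers experiment would additionally need interleaved rounds and lossy message delivery, which this sketch does not model.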