Group Communication Group oriented activities are steadily increasing. There are many types of groups: Open and Closed groups Peer-to-peer and hierarchical groups We focus on how the members of a group communicate Major issues Different types of multicast Dynamic groups Atomic multicast Ordered multicast How to communicate when the membership constantly changes Failure handling Keeping track of membership and membership changes Atomic multicast A multicast is called atomic, when the message is delivered to every correct member, or to no member at all. How can we implement atomic multicast? Basic vs. reliable multicast Basic multicast does not consider crash failures. Reliable multicast does. Three criteria for basic multicast: Liveness. Each process must receive every message Integrity. No spurious message received No duplicate. Accepts exactly one copy of a message Reliable atomic multicast Sender’s program Receiver’s program i:=0; do i ≠ n send message to i; i:= i+1 od if m is new accept it; multicast m; m is duplicate discard m fi Tolerates process crashes. Multicast support in networks Sometimes, certain features available in the infrastructure of a distributed system simplify the implementation of multicast. Examples are Multicast on an ethernet LAN IP multicast IP Multicast Although multicasts can be implemented using point-to-point communications, there are some practical forms of multicasts that make use of the inherent multicasting ability of the underlying medium. IP multicast is a bandwidth-conserving technology that reduces traffic by simultaneously delivering a single stream of information to multiple clients. Applications that take advantage of multicast include distance learning, videoconferencing, and distribution of software, stock quotes, and news. The source sends only one copy, which is replicated by the routers. Distribution trees Class D addresses are assigned to multicast groups. 2 A 1 source B 2 4 Routers maintain and update distribution trees when members join / leave a group. D Concern: Too much load on routers. Application layer multicast overcomes this. F 15 C E 6 7 (a) Source tree rendezvous point 2 1 1 source A B D 2 4 source F 15 C E 7 (b) Shared tree 1 6 Ordered multicasts Total order Causal order Local order (a.k.a. Single source FIFO) Definitions? Applications? Total order multicast is useful in the consistent update of replicated servers Causal order multicast is relevant in implementing bulletin boards Local order multicast is useful in updating cache memories in multi-computers Implementing total order multicast Basic multicast using a sequencer {The sequencer S} define seq: integer (initially 0} do receive m multicast (m, seq); seq := seq+1; deliver m od sequencer Implementing basic total order multicast Distributed implementation without a sequencer Uses the idea of 2PC p q r 3 18 4 22 6 19 7 10 14 Implementing basic total order multicast Step 1. Sender i sends (m, ts) to all Step 2. Receiver j saves it in a holdback queue, and sends an ack (a, ts) Step 3. Receive all acks, and pick the largest ts. Then send (m, ts, commit) to all. Step 4. Receiver removes it from the holdback queue and delivers m in the ascending order of timestamps. Why does it work? Implementing basic causal order multicast Use vector clocks.The recipient i delivers a message from j iff 1,0,0 0,0,0 (1,1,0) P0 m1 1. VCj(j) = LCj(i) +1 {LC = local vector clock} m1 m2 m3 1,1,0 0,0,0 P1 (2,1,0) (1,0,0) m2 2. k: k≠j :: VCk(j) ≤ LCk(i) m3 2,1,1 0,0,0 VC = incoming vector clock LC = Local vector clock 2,1,0 P2 (1,0,0) (1,1,0) (2,1,0) ? (violation) Note the slight difference in the implementation of the vector clocks Reliable multicast Tolerates process crashes. The additional requirements are as follows: Only correct processes are required to receive the messages from all correct processes in the group. Multicasts by faulty processes will either be received by every correct process, or by none at all. A theorem on reliable multicast In an asynchronous distributed system, total order reliable multicasts cannot be implemented when even a single process undergoes a crash failure. Why? Since it will violate the FLP impossibility result. Scalable Reliable Multicast IP multicast or application layer multicast has to detect the loss of messages and use retransmission for achieving reliability. For large groups (like distance learning applications) scalability is a major problem. Scalable Reliable Multicast Difficult to scale: Sender state explosion Message implosion State: receiver 1, receiver 2, … receiver n Scalable Reliable Multicast If omission failures are rare, then receivers will only report the non-receipt of messages using NACK, This has the potential to trigger selective point-to-point retransmission The reduction of acknowledgements is the underlying principle of Scalable Reliable Multicasts (SRM). If several members of a group fail to receive a message, then each such member waits for a random period of time before sending its NACK. This helps to suppress redundant NACKs. Sender multicasts the missing copy only once.