Because a plain BFT system cannot scale. You can start with a

A hybrid DHT and BFT approach for the Adder bulletin board distributed system.

Why?
Because a plain BFT system cannot scale. You can start with a minimum of 4 nodes to tolerate 1 faulty and 1 partitioned server, but all of them will work in lockstep, without any kind of load balancing. If you add more servers, either to scale or to tolerate more faulty nodes, the system will perform at best equally and more likely worse, because more messages have to be exchanged in the 3-phase synchronization protocol. Besides, in WAN operation the network latency will affect the system considerably, as every node communicates with all other nodes.

Additionally, Adder’s bulletin board does not need total ordering of messages within a stage. In other words, although stage switching must be performed (virtually) synchronously by all nodes, the intra-stage messages can be inserted in any order. The single, and most annoying, exception to this relaxation of consistency requirements is that messages from the same user have to be partially ordered, i.e. message m+1 from user x should never appear before message m in any replica that stores his messages.

What?
A DHT is a well-understood and efficient way to partition a range of keys deterministically. The central idea is that the client program can use the hash function to determine the subset of servers responsible for the range that includes the authenticated user, and submit all of that user's messages to this subset. A crucial assumption, borrowed from the static membership model of the BFT approach, is that the client knows all nodes of the system. For the DHT part, Chord (Stoica et al., 2001) will be used as a reference platform, along with ideas from CFS (Dabek et al., 2001). For the BFT state machine replication, Practical BFT (Castro & Liskov, 1999) will be used as a reference platform. This should not affect generality, as no particular characteristics of these systems will be used (???)

How?
A set of <n> servers participates in a single DHT. A parameter <k> defines the number of nodes that will replicate the operations the users submit. <k> implies that a single node participates in <k> address ranges, hence the partitions are not <n> but <n>/<k>. A user signs on to the client software, which, given his user name, obtains the <key> of the partition he belongs to via the consistent hash function (e.g. SHA-1). Whatever messages the user generates are transmitted to the set of <k> successive servers, starting from the one responsible for the address range <key> belongs to. These <k> servers form a group that mimics the operation of a single BFT group.

The important property to be preserved is that messages arriving from a single user have to be ordered. The authority that enforces this ordering can be the server responsible for the address range that includes <key>. The issue, however, is how a faulty server in that role can be handled gracefully. One solution is the idea of the “primary” from the BFT approach: the group maintains “views”, and the server at position (view number mod <k>) in the group is the primary for the range. This looks overly complicated though. Another is the idea of a server leaving the Chord ring, so that its successor takes over the faulty server’s address range.

The key problems here are (a) how the clients will learn that fact and (b) how the remaining servers will allow the faulty server to rejoin the group once it is repaired. (a) does not look too difficult if the client multicasts his requests to all <k> nodes. However, (b) needs some more thought.
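The placement rule and the two fail-over ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Adder, Chord or PBFT implementation: the class name Ring, its methods, the server names and the choice of 7 servers with <k> = 4 are all hypothetical, and server identifiers are derived by hashing the server name with SHA-1, as Chord does for node addresses.

```python
import hashlib
from bisect import bisect_left


def sha1_key(name: str) -> int:
    """Map a user or server name onto the 160-bit SHA-1 identifier space."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical static ring: every client is assumed to know all servers."""

    def __init__(self, servers, k):
        if not 1 <= k <= len(servers):
            raise ValueError("k must be between 1 and the number of servers")
        self.k = k
        # Servers sorted by their position on the identifier circle.
        self.points = sorted((sha1_key(s), s) for s in servers)

    def replica_group(self, user):
        """Return the k successive servers starting at the successor of the user's key."""
        key = sha1_key(user)
        ids = [point for point, _ in self.points]
        # Successor = first server whose identifier is >= key, wrapping around.
        start = bisect_left(ids, key) % len(self.points)
        return [self.points[(start + i) % len(self.points)][1]
                for i in range(self.k)]

    def primary(self, user, view):
        """Idea 1: the primary for a range rotates as (view number mod k)."""
        return self.replica_group(user)[view % self.k]

    def without(self, server):
        """Idea 2: the ring as seen after a faulty server has left it."""
        return Ring([s for _, s in self.points if s != server], self.k)


# Usage: seven assumed servers, groups of four.
ring = Ring([f"server-{i}" for i in range(7)], k=4)
group = ring.replica_group("alice")
print("group for alice:", group)
print("primary in view 0:", ring.primary("alice", 0))
print("primary in view 1:", ring.primary("alice", 1))
print("group after first server leaves:",
      ring.without(group[0]).replica_group("alice"))
```

In this sketch both ideas hand the ordering authority to the next server in the group. The difference is that the view-based variant keeps the group membership fixed, whereas removing the server from the ring shifts the group by one and pulls in a new server at the end, which is where problems (a) and (b) arise.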
