The Case for Byzantine Fault Detection
Andreas Haeberlen (MPI-SWS / Rice University)
Petr Kouznetsov (MPI-SWS)
Peter Druschel (MPI-SWS)
© 2006 Andreas Haeberlen, MPI-SWS

Challenge: Byzantine faults
Distributed systems are subject to a variety of failures and attacks
 Hacker break-in
 Freeloading
 Censorship
 Data corruption
 Software/hardware failure
Byzantine failure model: Faulty nodes may exhibit arbitrary behavior
Dependable systems must be protected against Byzantine faults

Existing approach: Fault tolerance
[Figure: a client and a set of server replicas]
Byzantine fault tolerance (BFT) can mask a limited number of Byzantine faults
 Example: Castro and Liskov [OSDI'99]

Byzantine Fault Detection




Alternative approach: Fault detection
Nodes monitor each other for faulty behavior
When a fault occurs, the correct nodes
 identify the faulty node(s)
 distribute evidence of the fault
Nodes can isolate the faulty node + initiate recovery
[Figure: nodes A, B, C, D, and E; messages "Set X=5" and "OK"]

Best approach depends on the application

Air traffic control (machine room)
 Failures may be fatal!
 Goal: Mask fault symptoms
 Delays negligible, bandwidth plentiful, few nodes
 Typical application for Fault Tolerance

Inter-domain routing (Sprint, AT&T, Level3)
 Best-effort service
 Goal: Find faulty components
 Wide-area delays, limited bandwidth, many nodes
 Typical application for Fault Detection

Detection can provide accountability
In an accountable system:
 Actions are undeniable
 State is tamper-evident
 Correctness can be certified
Good nodes can provide evidence that they are good
Bad nodes cannot hide evidence of misbehavior
Proven concept in society
 Banking, administration ...
Desirable for distributed systems [Yumerefendi05]
 Example: Building trust in federated systems

What about performance?
If up to f nodes can be faulty, we need f+1 replicas to guarantee detection (fault tolerance: 3f+1)
 More throughput using the same resources
 Works even when >33% of the nodes can become faulty
Detection can defer overhead to periods of low load
 System can deliver high peak throughput
Detection does not require consensus
 Potentially less expensive than BFT
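
The replica counts quoted above can be made concrete with a few lines of arithmetic. The sketch below only evaluates the two formulas from the slide (f+1 and 3f+1); the function names are illustrative and do not come from the talk.

```python
def replicas_for_detection(f: int) -> int:
    """Replicas needed to guarantee detection of up to f faulty nodes: f + 1."""
    return f + 1


def replicas_for_tolerance(f: int) -> int:
    """Replicas needed to mask up to f faulty nodes: 3f + 1."""
    return 3 * f + 1


for f in (1, 2, 3):
    detect = replicas_for_detection(f)
    tolerate = replicas_for_tolerance(f)
    # Fraction of replicas that may be faulty in each configuration
    print(f, detect, f / detect, tolerate, f / tolerate)
```

For f = 1 this gives 2 replicas for detection versus 4 for tolerance. The faulty fraction f/(3f+1) always stays below one third, while f/(f+1) can exceed it, which is the ">33%" remark on the slide.
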
Outline




Introduction
BFD abstraction
PeerReview algorithm
Conclusion
How is BFD used?
[Figure: a node with application, state machine, and detector on top of the network; the detector tells the application "Node X is faulty!". No assumptions about faulty nodes.]
Each correct node has state machine + detector
Detector can inspect all messages at its local node
When detector observes a fault on another node,
 it informs its local application, and
 it provides evidence of the fault to other detectors
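
The two reactions of a detector described above (inform the local application, pass evidence to other detectors) can be sketched as follows. This is a minimal illustration, not the interface from the paper: the class, its method names, and the `evidence_is_valid` placeholder are all assumptions, and real evidence would consist of signed messages or logs that each receiver checks for itself.

```python
def evidence_is_valid(evidence) -> bool:
    # Placeholder (assumption): a real detector would verify signatures here
    return True


class Detector:
    def __init__(self, node_id, notify_app):
        self.node_id = node_id
        self.notify_app = notify_app  # callback into the local application
        self.peers = []               # detectors on other nodes
        self.reported = set()         # nodes already reported as faulty

    def observe_fault(self, accused, evidence):
        """React to a fault on another node."""
        if accused in self.reported:
            return                    # already handled; avoids forwarding loops
        self.reported.add(accused)
        self.notify_app(accused)      # inform the local application
        for peer in self.peers:       # provide the evidence to other detectors
            peer.receive_evidence(accused, evidence)

    def receive_evidence(self, accused, evidence):
        if evidence_is_valid(evidence):
            self.observe_fault(accused, evidence)


alerts = []
a = Detector("A", lambda node: alerts.append(("A", node)))
b = Detector("B", lambda node: alerts.append(("B", node)))
a.peers.append(b)
b.peers.append(a)
a.observe_fault("X", "hypothetical evidence")
print(alerts)
```

Each application is notified once, even though the two detectors forward evidence to each other.
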
Only observable faults can be detected
[Figure: three scenarios. In each, A sends "Set X=5" to B and receives "OK"; C then sends "Get X" to B.]
 Correct: B answers "5"
 Detectably faulty: B answers "7"
 Detectably ignorant: B does not answer
Two classes of observable faults:
 Detectable faultiness: Node breaks the protocol
 Detectable ignorance: Node refuses to respond
As long as a faulty node continues to follow the protocol, BFD cannot detect the fault!

BFD can give strong guarantees
Three types of detector output
 Trusted, suspected, exposed
Strong completeness
 "No false negatives"
Strong accuracy
 "No false positives"
Precise definitions are in the paper
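
One way to picture the three detector outputs is as a function of the evidence a detector currently holds. The sketch below is an assumption-laden reading of the slide: it takes "exposed" to mean that proof of a protocol violation exists and "suspected" to mean that a challenge is still unanswered. The names and the mapping are illustrative; the precise definitions are in the paper.

```python
from enum import Enum


class Output(Enum):
    TRUSTED = "trusted"
    SUSPECTED = "suspected"
    EXPOSED = "exposed"


def detector_output(has_proof_of_violation: bool,
                    has_unanswered_challenge: bool) -> Output:
    # Assumption: proof of a protocol violation takes precedence
    if has_proof_of_violation:
        return Output.EXPOSED
    # Assumption: an open challenge justifies suspicion only, because the
    # node may still answer later
    if has_unanswered_challenge:
        return Output.SUSPECTED
    return Output.TRUSTED


print(detector_output(False, False))
print(detector_output(False, True))
print(detector_output(True, True))
```
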
Assumptions
1. Protocol can be modeled as a deterministic state machine
2. Each node has a strong identity, as well as a public/private keypair for signing messages
3. The faulty nodes cannot
  prevent two correct nodes from communicating
  break the cryptographic keys

Secure logging
[Figure: nodes A, B, and C with their logs]
B's log
 Rcv(A, "Set X=5")
 Send(A, "Okay")
 Rcv(C, "Get X")
 Send(C, "5")
A's log
 Send(B, "Set X=5")
 Rcv(B, "Okay")
C's log
 Send(B, "Get X")
 Rcv(B, "5")
All messages are signed and acknowledged
Each node keeps a log of all local inputs and outputs
Nodes must commit to the contents of their log
 Log is tamper-evident [Maniatis02]
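
A common way to make a log tamper-evident is a hash chain, in which each entry's hash covers the previous hash, so that a single value commits to the whole history. The sketch below illustrates that general technique with SHA-256 from Python's standard library; it is an assumption about how such a log could look, not the construction from [Maniatis02] or the paper, and it omits the signatures the slide mentions.

```python
import hashlib

GENESIS = b"\x00" * 32  # starting value of the chain (arbitrary choice)


def entry_hash(prev_hash: bytes, kind: str, peer: str, message: str) -> bytes:
    """Hash of one entry, chained to the hash of everything before it."""
    data = prev_hash + repr((kind, peer, message)).encode("utf-8")
    return hashlib.sha256(data).digest()


def top_hash(entries) -> bytes:
    """Recompute the hash that commits to the whole log."""
    h = GENESIS
    for kind, peer, message in entries:
        h = entry_hash(h, kind, peer, message)
    return h


log_b = [
    ("Rcv", "A", "Set X=5"),
    ("Send", "A", "Okay"),
    ("Rcv", "C", "Get X"),
    ("Send", "C", "5"),
]
commitment = top_hash(log_b)  # the value B would sign and hand out

tampered = list(log_b)
tampered[3] = ("Send", "C", "7")  # rewrite history after committing

print(top_hash(log_b) == commitment)     # True
print(top_hash(tampered) == commitment)  # False
```

Any later change to an entry changes the recomputed top hash, so it no longer matches the commitment the node handed out earlier.
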
Detecting ignorance
[Figure: nodes A, B, and C; B's log]
 Rcv(A, "Set X=5")
 Send(A, "Okay")
 Rcv(C, "Get X")
If a node refuses to acknowledge a message
 Send message as evidence to other nodes
 Correct nodes will challenge the ignorant node to prove that its log contains a 'Rcv' entry for that message
 A correct node can always respond
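
The challenge step can be sketched as a small exchange. The functions below are hypothetical and model logs as plain lists of tuples without signatures. They show why a correct node can always answer (it can log the message on the spot) while a node that keeps ignoring the message has nothing to show.

```python
def correct_node_response(log, sender, message):
    """A correct node logs the message if needed, then shows the entry."""
    entry = ("Rcv", sender, message)
    if entry not in log:
        log.append(entry)
    return entry


def ignorant_node_response(log, sender, message):
    """A node that keeps ignoring the message provides no entry."""
    return None


def evaluate_challenge(response, sender, message):
    # Assumption: an unanswered challenge leaves the node suspected
    if response == ("Rcv", sender, message):
        return "answered"
    return "suspected"


log_b = [("Rcv", "A", "Set X=5"), ("Send", "A", "Okay")]

ok = correct_node_response(log_b, "C", "Get X")
print(evaluate_challenge(ok, "C", "Get X"))

silent = ignorant_node_response(log_b, "C", "Get X")
print(evaluate_challenge(silent, "C", "Get X"))
```
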
Detecting faultiness
[Figure: an auditor replays B's log on B', a snapshot of the state machine B is expected to run]
B's log
 Rcv(A, "Set X=5")
 Send(A, "Okay")
 Rcv(C, "Get X")
 Send(C, "7")
Replay on B'
 Rcv(A, "Set X=5")
 Send(A, "Okay")
 Rcv(C, "Get X")
 Send(C, "5")
Nodes can audit each other's log at any time
Auditors replay input in the log, compare output
If a divergence is detected
 Send log as evidence to other nodes
 Other nodes can repeat the same procedure to check whether the node is really faulty (no he-said-she-said!)
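
The audit step relies on the protocol being a deterministic state machine: replaying the logged inputs must reproduce the logged outputs. The sketch below assumes a tiny key-value protocol modeled on the Set/Get messages in the slides; the state machine and the `audit` function are illustrative, and a real auditor would also check signatures and the log's hash chain.

```python
def kv_state_machine(state, message):
    """Hypothetical deterministic state machine: a tiny key-value store."""
    if message.startswith("Set "):
        key, value = message[4:].split("=", 1)
        state[key] = value
        return "Okay"
    if message.startswith("Get "):
        return state.get(message[4:], "")
    return "Error"


def audit(log):
    """Replay inputs; return the index of the first diverging output, or None.

    Simplification: every 'Rcv' is expected to be followed by exactly one
    'Send' to the same peer.
    """
    state = {}
    expected = None
    for index, (kind, peer, message) in enumerate(log):
        if kind == "Rcv":
            expected = (peer, kv_state_machine(state, message))
        elif kind == "Send":
            if expected != (peer, message):
                return index
            expected = None
    return None


good_log = [("Rcv", "A", "Set X=5"), ("Send", "A", "Okay"),
            ("Rcv", "C", "Get X"), ("Send", "C", "5")]
bad_log = good_log[:3] + [("Send", "C", "7")]

print(audit(good_log))  # None
print(audit(bad_log))   # 3
```

Because the replay is deterministic, any other node that runs `audit` on the same log reaches the same verdict, which is what rules out he-said-she-said disputes.
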
Summary



New approach: Byzantine Fault Detection
 Alternative to fault tolerance
 Provides accountability
Fault Detection can give strong guarantees
 Eventual strong accuracy and completeness
Early results indicate Fault Detection is practical
 Example: PeerReview algorithm
Thank you!