Fail-Stop Processors

UNIVERSITY of WISCONSIN-MADISON
Computer Sciences Department
CS 739
Distributed Systems
Andrea C. Arpaci-Dusseau
Fail-Stop Processors
Byzantine Generals in Action: Implementing Fail-Stop Processors, Fred Schneider, TOCS, May 1984
• Example usage of Byzantine agreement
• Why fail-stop processors can simplify replicated services
• Why fail-stop processors are expensive (impractical?) to build
Remaining Time: Byzantine Werewolves (improved?)
Motivation
Goal: Build systems that continue to work in
presence of component failure
Difficulty/cost of building those systems depends
upon how components can fail
Fail-stop components make building reliable
systems easier than components with byzantine
failures
Fail-Stop Processors
What is a failure?
• Output (or behavior) that is inconsistent with specification
What is a Byzantine failure?
• Arbitrary, even malicious, behavior
• Components may collude with each other
• Cannot necessarily detect output is faulty
What is a fail-stop processor?
• Halts instead of performing erroneous transformations
• Others can detect halted state
• Others can access uncorrupted stable storage even after failure
Questions to Answer
1) What are the advantages of fail-stop processors?
2) Real processors are not fail-stop
• Can we build one?
• How can we build an approximation of one?
3) Approximations of fail-stop processors are expensive to build
• Under what circumstances is a replicated service with fail-stop processors “better”?
1) Distributed State Machine
Common approach for building a reliable system
Idea: Replicate faulty servers, coordinate client interactions
with replicas
[Figure: Client sends input sequence via Byzantine agreement to the state machine replicas (R, R, R); combine outputs to produce the output]
t-fault tolerant: Satisfies specification as long as no more than t components fail
Failure model of components determines how many replicas, R, are needed and their interactions
How to build a t-fault tolerant state machine?
Inputs
• Key: All replicas receive and process same sequence of inputs
• 1) Agreement: Every nonfaulty replica receives same request (interactive
consistency or byzantine agreement)
• 2) Ordering: Every nonfaulty replica processes requests in same order
(logical clocks)
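The ordering requirement can be illustrated with a small sketch. It assumes agreement has already delivered the same set of requests to every replica, and that each request carries a logical-clock timestamp plus a client id used as a tie-breaker. The `Request` type, the toy operations, and `apply_in_order` are hypothetical names for illustration. A real logical-clock scheme also needs a stability test (knowing that no request with an earlier timestamp can still arrive), which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    timestamp: int  # logical-clock value attached to the request (assumed)
    client_id: int  # tie-breaker so that the order is total
    op: str         # toy operation: "inc" or "double"


def apply_in_order(requests, state=0):
    """Apply requests in (timestamp, client_id) order, whatever the arrival order."""
    for req in sorted(requests, key=lambda r: (r.timestamp, r.client_id)):
        if req.op == "inc":
            state += 1
        elif req.op == "double":
            state *= 2
        else:
            raise ValueError(f"unknown operation: {req.op}")
    return state


reqs = [Request(1, 7, "inc"), Request(2, 3, "double"), Request(2, 5, "inc")]
replica_a = apply_in_order(reqs)                  # requests arrive in one order
replica_b = apply_in_order(list(reversed(reqs)))  # same requests, reversed arrival
print(replica_a, replica_b)  # both replicas reach state 3
```

Because the operations do not commute (increment then double differs from double then increment), the replicas agree only because they sort before applying.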
Outputs
                      Byzantine    Fail-Stop
Combine output?       majority     any
Number of replicas?   2t+1         t+1
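A minimal sketch of the two output-combining rules, assuming replica outputs arrive as a plain list and that a halted fail-stop replica is represented by `None`. The function names are hypothetical and chosen for illustration only.

```python
from collections import Counter


def replicas_needed(t, failure_model):
    """Replica count for a t-fault tolerant state machine."""
    if failure_model == "byzantine":
        return 2 * t + 1
    if failure_model == "fail-stop":
        return t + 1
    raise ValueError(f"unknown failure model: {failure_model}")


def combine_byzantine(outputs, t):
    """Majority vote: accept the value reported by at least t+1 replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count < t + 1:
        raise RuntimeError("no value has t+1 votes; more than t replicas are faulty")
    return value


def combine_fail_stop(outputs):
    """Any output will do: a fail-stop replica either halts or is correct."""
    for out in outputs:
        if out is not None:  # None marks a halted replica (assumed encoding)
            return out
    raise RuntimeError("all replicas have halted")


print(combine_byzantine([7, 7, 9], t=1))  # 7: one faulty replica is outvoted
print(combine_fail_stop([None, 7]))       # 7: one replica halted, the other is trusted
```

With Byzantine components a wrong answer must be outvoted, hence 2t+1 replicas; with fail-stop components a single surviving replica is enough, hence t+1.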
2) Building a Fail-Stop Processor
Must provide stable storage
• Volatile: Lost on failure
• Stable
– Not affected (lost or corrupted) by failure
– Can be read by any processor
– Benefit: Recover work of failed process
– Drawback: Minimize interactions since slow
Can only build approximation of fail-stop processor
• Finite hardware -> Finite failures could disable all error detection
hardware
k-fail-stop processor: behaves fail-stop unless k+1 or more
failures
Implementation of k-FSP: Overview
Two components
• k+1 p-processes (program)
• 2k+1 s-processes (storage)
• Each process runs on own processor, all connected with network
P-Processes (k+1)
• Each runs program for state machine
• Interacts with s-processes to read and write data
• If any fail (if any disagreement), then all STOP
• Cannot necessarily detect k+1 failures
S-Processes (2k+1)
• Each replicates contents of stable storage for this FSP
• Provides reliable data with k failures (cannot just stop)
• Detects disagreements/failures across p-processes
– How???
Interactive Consistency
Requirements
IC1. If a p-process is nonfaulty, then every nonfaulty s-process receives its request within δ seconds (as measured on the s-process clock)
IC2. Nonfaulty s-processes in the same k-FSP agree on every request from p-process j
• S-processes must agree even when p-process is faulty
To provide IC1 and IC2:
• Assuming can authenticate sender of messages, use signed message (SM) protocol for Byzantine agreement
– Need just k+1 processes for agreement
IC3. For each k-FSP, clocks of all p-processes are synchronized
• All nonfaulty p-processes must send requests at same time to s-processes
FSP Algorithm Details: Writes
Each p-process, on a write:
• Broadcast write to all s-processes
• Byzantine agreement across all s-processes (all s-processes must agree on same input value from a particular p-process)
Each s-process, on a write (Fig 1):
• Ensure each p-process writes same value and each request is received within time bound
– Initial code: Handle messages after at least time δ has transpired since receipt (every s-process should receive by then)
– If receive same write request from all k+1 p-processes (|M| = k+1), then update value in stable storage
– If not, then halt all p-processes
• Set failed variable to true
• Do not allow future writes
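The s-process side of a write can be sketched as follows. This is an illustration under stated assumptions, not the code from Fig 1 of the paper: it assumes Byzantine agreement and the wait of δ have already produced the agreed set M of requests, passed in as a dict from p-process id to `(variable, value)`. The names `SProcess` and `handle_write` are hypothetical.

```python
class SProcess:
    """One storage replica of a k-fail-stop processor (illustrative sketch)."""

    def __init__(self, k):
        self.k = k
        self.storage = {}    # replicated contents of stable storage
        self.failed = False  # set once a disagreement or missing request is seen

    def handle_write(self, requests):
        """requests: {p_process_id: (variable, value)}, the agreed set M.

        Returns True if the write was applied, False otherwise.
        """
        if self.failed:
            return False  # no future writes once the FSP has halted
        distinct = set(requests.values())
        if len(requests) == self.k + 1 and len(distinct) == 1:
            variable, value = next(iter(distinct))
            self.storage[variable] = value
            return True
        # A request is missing (slow or crashed p-process) or values disagree.
        self.failed = True
        return False


s = SProcess(k=2)
print(s.handle_write({0: ("b", 7), 1: ("b", 7), 2: ("b", 7)}))  # True: all 3 agree
print(s.handle_write({0: ("b", 8), 1: ("b", 8)}))               # False: one request missing
print(s.failed)                                                  # True: FSP is halted
```

Note the limit of the check: if all k+1 p-processes submit the same wrong value, the write is applied, which is why the guarantee holds only up to k failures.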
FSP Algorithm Details: Reads
Each p-process, on a read:
• Broadcast request to all s-processes
• Use result from majority (k+1 out of 2k+1)
• Can read from other FSPs as well
– Useful if FSP failed and re-balancing work
Each p-process, determine if halted/failed:
• Read failed variable from s-process
(use majority)
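A sketch of the majority rule a p-process could apply to read responses, assuming the responses from the s-processes have already been collected into a list. `majority_read` is a hypothetical helper; returning `None` when no value reaches k+1 votes is an assumption of this sketch.

```python
from collections import Counter


def majority_read(responses, k):
    """Return the value reported by at least k+1 of the 2k+1 s-processes.

    Returns None if no value reaches k+1 votes (assumed convention).
    """
    if not responses:
        return None
    value, count = Counter(responses).most_common(1)[0]
    return value if count >= k + 1 else None


# k = 2, so there are 5 s-processes and a majority is 3.
print(majority_read([6, 6, 6, 5, 5], k=2))  # 6: two faulty s-processes are outvoted
print(majority_read([6, 6, 5, 5, 5], k=2))  # 5: three faulty s-processes win the vote
print(majority_read([False, False, False, True, True], k=2))  # False: failed flag, same rule
```

The second call shows why the storage tolerates only k failures: k+1 faulty s-processes form a majority on their own.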
FSP Example
k=2, SM code: “b=a+1”; How many p and s processes?
[Figure: p-processes 0-2 (k+1 = 3) and s-processes 0-4 (2k+1 = 5); stable storage holds a: 6, b: (blank), failed: 0]
How do p-processes read a?
1) Broadcast request to each s-process
2) Each s-process responds to read request
3) Each p-process uses majority of responses from s-processes
FSP Example
k=2, SM code: “b=a+1”
[Figure: p-processes 0-2 and s-processes 0-4; stable storage fields a, b, failed]
How do p-processes read a?
• What if 2 s-processes fail?
– E.g., think a=5?
• What if 3 s-processes fail?
FSP Example
k=2, SM code: “b=a+1”
[Figure: p-processes 0-2 and s-processes 0-4; stable storage fields a, b, failed]
How do p-processes write b?
• Each p-process j performs byzantine agreement using signed
message protocol SM(2) across s-processes
– Each s-process must agree on what p-process j is doing, even if j is
faulty
• Each s-process looks at requests after time delta elapsed
• If see same write from all k+1 processes, perform write
– Otherwise, halt all p-processes; forbid future writes
FSP Example
k=2, SM code: “b=a+1”
[Figure: p-processes 0-2 and s-processes 0-4; stable storage fields a, b, failed]
How do p-processes write b?
• What if 1 p-process (or network) is very slow?
• What if 1 p-process gives incorrect request to all s-processes?
• What if 1 p-process gives incorrect request to some?
– Byzantine agreement catches: All s-processes agree that p-process is faulty (giving different requests); agree to treat it similarly
– When s-processes see that its request doesn’t agree with the other p-processes, they will halt
• What if 3 p-processes give bad result?
3) Higher-Level Example
Goal: Service handling k faults; N nodes for performance
Solution: Use N+k k-fail-stop processors
Example: N=2, k=3
[Figure: k-fail-stop processors FSP0-FSP4 and stable storage SS0-SS4]
What happens if:
• 3 p-processes in FSP0 fail? 4 p-processes in FSP0 fail?
• 1 p-process in FSP0, FSP1, and FSP2 fail? also in FSP3?
• 2 p-processes in FSP0, FSP1, and FSP2 fail?
• 1 s-process in SS0 fails? also in SS1, SS2, and SS3?
• 4 s-processes in SS0 fail?
Should we use Fail-Stop Processors?
Metric: Hardware cost for state machines:
• Fail-stop components:
– Worst-case (assuming 1 process per processor):
• (N+k) * [(2k+1) + (k+1)] = (N+k) * (3k+2) processors
– Best-case (assuming s-processes from different FSPs share the same processors):
• (N+k)(k+1) + (2k+1) processors
• Byzantine components:
– N * (2k+1)
• Fail-stop can be better if s-processes share and N>k…
Metric: Frequency of byzantine agreement protocol
• Fail-Stop: On every access to stable storage
• Byzantine: On every input read
• Probably fewer input reads
Summary
Why build fail-stop components?
• Easier for higher layers to model and deal with
• Matches assumptions of many distributed protocols
Why not?
• Usually more hardware
• Usually more agreements needed
• Higher-levels may be able to cope with “slightly faulty”
components
– Violates end-to-end argument
Conclusion: Probably shouldn’t assume fail-stop
components
Byzantine Werewolves
Previous: Too easy for villagers to identify werewolves
• Villager A had reliable information that Z was werewolf
• Villager B could validate that A was villager
• Hard for Z to lie that C was werewolf, because D could have checked C
too
Signed Protocol: Many could hear what one said
• Difficult for werewolves to tell different lies to others
• Have to tell everyone same thing
New Changes to give more advantage to werewolves:
• Unknown number of werewolves (1 <= w < 1/2 N)
• Night: Werewolves convert multiple villagers to wolves (1 <= v <= w)
– Key: Info told by moderator will then be stale and wrong!
• Day: Villagers can vote to lynch multiple victims
Byzantine-Werewolf Game Rules
Everyone secretly assigned as werewolf or villager
• W werewolves, rest are “seeing” villagers
• I am moderator
Night round (changed order):
• “Close your eyes”; make noises with one hand to hide activity
• For all: “NAME, open your eyes” “Pick someone to ask about”
– Useless for Werewolves, but hides their identity…
– Point to another player
– Moderator signs thumbs up for werewolf, down for villager
– “NAME, close your eyes”
• “Werewolves, open your eyes”: W can see who is who
– “Werewolves, pick villagers to convert”
– Moderator picks secret number between 1 and W
– Silently agree on villagers by pointing
– Moderator taps converts on shoulder; should open eyes to see other werewolves
– “Werewolves, close your eyes”
Rules: Day Time
Day Time: “Everyone open your eyes; it’s daytime”
• Agreement time: Everyone talks and votes on who should be “decommissioned”
– Villagers try to decommission werewolves
– Werewolves try to trick villagers with bad info
– Someone must propose who should be killed
– Vote until kill villager or no more proposals or no majority
• Werewolves really spread at night, so large incentive to kill as many as possible now
– Moderator: Uses majority voting to determine who is decommissioned; “Okay, NAME is dead”
– Person is out of game (can’t talk anymore) and shows card
Repeat cycle until all werewolves dead OR werewolves >= villagers