+route(C, foo.com)

advertisement
Secure Network Provenance
Wenchao Zhou*, Qiong Fei*, Arjun Narayan*,
Andreas Haeberlen*, Boon Thau Loo*, Micah Sherr+
*University
of Pennsylvania
+Georgetown
http://snp.cis.upenn.edu/
University
Motivation
Route r2
Why did my route to
foo.com change?!
D
E
A
Innocent Reason?
Malicious Attack?
Alice
Route r1

foo.com
B
C
An example scenario: network routing
 System administrator observes strange behavior
 Example: the route to foo.com has suddenly changed
 What exactly happened (innocent reason or malicious
attack)?
2
We Need Secure Forensics

For network routing …
 Example: incident in March 2010


Traffic from Capital Hill got redirected
… but also for other application scenarios
 Distributed hash table: Eclipse attack
 Cloud computing: misbehaving machines
 Online multi-player gaming: cheating

Goal: secure forensics in adversarial scenarios
3
Ideal Solution
Route r2
Q: Explain
theto
Why
did mywhy
route
route
to foo.com
foo.com
change?!
changed to r2.
D
E
A
Alice
foo.com
C
B
The Network
A: Because someone accessed
Router D and changed the
configuration from X to Y.

Not realistic: adversary can tell lies
4
Challenge: Adversaries Can Lie
Everything is fine. Router
I should cover up
E advertised a new route.
the intrusion.
Route r2
Q: Explain why the
route to foo.com
changed to r2.
D
E
A
Alice
foo.com
C
B
The Network

Problem: adversary can …


... fabricate plausible (yet incorrect) response
… point accusation towards innocent nodes
5
Existing Solutions

Existing systems assume trusted components
 Trusted OS kernel, monitor, or hardware

E.g. Backtracker [OSDI 06], PASS [USENIX ATC 06], ReVirt [OSDI 02],
A2M [SOSP 07]
 These components may have bugs or be compromised
 Are there alternatives that do not require such trust?

Our solution:
 We assume no trusted components;
 Adversary has full control over an arbitrary subset of the
network (Byzantine faults).
6
Ideal Guarantees


Fundamentally
impossible
Ideally: explanation is always complete and accurate
Fundamental limitations
 E.g. Faulty nodes secretly exchange messages
 E.g. Faulty nodes communicate outside the system

What guarantees can we provide?
7
Realistic Guarantees
Route r2
Q: Why did my route to
foo.com change to r2?
D
E
A
foo.com
Alice
B
Aha, at least I know which
node is compromised.



The Network
C
A: Because someone accessed
Router D and changed its router
configuration from X to Y.
No faults: Explanation is complete and accurate
Byzantine fault: Explanation identifies at least one faulty node
Formal definitions and proofs in the paper
8
Outline

Goal: A secure forensics system that works in an
adversarial environment







Explains unexpected behavior
No faults: explanation is complete and accurate
Byzantine fault: exposes at least one faulty node with evidence
Model: Secure Network Provenance
Tamper-evident Maintenance and Processing
Evaluation
Conclusion
9
Provenance as Explanations
route(D, foo.com)
D
link(D, E)
route(E, foo.com)
E
link(E, B)
foo.com
route(A, foo.com)
A
Alice
route(A, B)
link(A, B)
route(A, D)
……
B

route(B, foo.com)
link(B, C)
C
route(C, foo.com)
link(C, foo.com)
Origin: data provenance in databases
 Explains the derivation of tuples (ExSPAN [SIGMOD 10])
 Captures the dependencies between tuples as a graph
 “Explanation” of a tuple is a tree rooted at the tuple
10
Secure Network Provenance
route(A, foo.com)
link(A, B)
route(B, foo.com)
link(B, C)

route(C, foo.com)
link(C, foo.com)
Challenge #1. Handle past and transient behavior
 Traditional data provenance targets current, stable state
 What if the system never converges?
 What if the state no longer exists?
11
Secure Network Provenance
Time = t1
Time = t2
Time = t3
route(A, foo.com)
route(C, foo.com)
link(C, foo.com)
route(B, foo.com)
route(C, foo.com)
link(B, C)
link(A, B)
link(C, foo.com)
route(B, foo.com)
route(C, foo.com)
link(B, C)
link(C, foo.com)
Timeline

Challenge #1. Handle past and transient behavior
 Traditional data provenance targets current, stable state
 What if the system never converges?
 What if the state no longer exists?
 Solution: Add a temporal dimension
12
Secure Network Provenance
Time = t1
Time = t2
Time = t3
route(A,
foo.com)
+route(A,
foo.com)
route(C, foo.com)
link(C, foo.com)
route(B, foo.com)
route(C, foo.com)
link(B, C)
link(A, B)
link(C, foo.com)
route(B,
+route(B,
foo.com)
foo.com)
route(C, foo.com)
+route(C,
foo.com)
link(B, C)
link(C, foo.com)
+link(C,
foo.com)
Timeline

Challenge #2. Explain changes, not just state
 Traditional data provenance targets system state
 Often more useful to ask why a tuple (dis)appeared
 Solution: Include “deltas” in provenance
13
Secure Network Provenance
route(D, foo.com)
link(D, E)
route(E, foo.com)
link(E, B)
route(A, foo.com)
link(A, B)
route(B, foo.com)
link(B, C)

route(C, foo.com)
link(C, foo.com)
Challenge #3. Partition and secure provenance
 A trusted node would be ideal, but we don’t have one
 Need to partition the graph among the nodes themselves
 Prevent nodes from altering the graph
14
Partitioning the Provenance Graph
route(D, foo.com)
link(D, E)
route(E, foo.com)
link(E, B)
route(A, foo.com)
link(A, B)
RECV
SEND
route(B, foo.com)
link(B, C)

route(C, foo.com)
link(C, foo.com)
Step 1: Each node keeps vertices about local actions
 Split cross-node communications

Step 2: Make the graph tamper-evident
15
Securing Cross-Node Edges
Signed
commitment
from B
Signed
ACK
from A
RECEIVE
SEND
RECV
Router A

SEND
Router B
Step 1: Each node keeps vertices about local actions
 Split cross-node communications

Step 2: Make the graph tamper-evident
 Secure cross-node edges (evidence of omissions)
16
Outline

Goal: A secure forensics system that works in an
adversarial environment







Explains unexpected behavior
No faults: explanation is complete and accurate
Byzantine fault: exposes at least one faulty node with evidence
Model: Secure Network Provenance
Tamper-evident Maintenance and Processing
Evaluation
Conclusion
17
System Overview
Users
Primary system
Provenance system
Operator
Extract provenance
Maintain provenance
Query provenance
Query engine
Application
Maintenance
engine
Network


Stand-alone provenance system
On-demand provenance reconstruction
 Provenance graph can be huge (with temporal dimension)
 Rebuild only the parts needed to answer a query
18
Extracting Dependencies

Option 1: Inferred provenance
 Declarative specifications explicitly capture provenance
 E.g. Declarative networking, SQL queries, etc.

Option 2: Reported provenance
 Modified source code reports provenance

Option 3: External specification
 Defined on observed I/Os of a black-box system
19
Secure Provenance Maintenance

Maintain sufficient information for reconstruction
 I/O and non-deterministic events are sufficient
 Logs are maintained using tamper-evident logging

Based on ideas from PeerReview [SOSP 07]
E
D
foo.com
Alice
A
……
RECV
ACK
B
C
……
SEND
RCV-ACK
20
Secure Provenance Querying

Recursively construct the provenance graph
 Retrieve secure logs from remote nodes
 Check for tampering, omission, and equivocation
 Replay the log to regenerate the provenance graph
E
D
Alice
A
Explain the route
from A to foo.com.
foo.com
route(A, foo.com)
RECV (from B)
link(A, B)
B
C
21
Secure Provenance Querying

Recursively construct the provenance graph
 Retrieve secure logs from remote nodes
 Check for tampering, omission, and equivocation
 Replay the log to regenerate the provenance graph
E
D
foo.com
route(A, foo.com)
Alice
A
link(A, B)
route(B, foo.com)
RECV (from C)
B
link(B, C)
C
22
Secure Provenance Querying

Recursively construct the provenance graph
 Retrieve secure logs from remote nodes
 Check for tampering, omission, and equivocation
 Replay the log to regenerate the provenance graph
E
D
foo.com
route(A, foo.com)
Alice
A
OK. Now I know
how the route
was derived.
link(A, B)
route(B, foo.com)
B
link(B, C)
C
route(C, foo.com)
link(C, foo.com)
23
Outline

Goal: A secure forensics system that works in an
adversarial environment







Explains unexpected behavior
No faults: explanation is complete and accurate
Byzantine fault: exposes at least one faulty node with evidence
Model: Secure Network Provenance
Tamper-evident Maintenance and Processing
Evaluation
Conclusion
24
Evaluation Results

Prototype implementation (SNooPy)
 How useful is SNP? Is it applicable to different systems?
 How expensive is SNP at runtime?
Traffic overhead, storage cost, additional CPU overhead?
 Does SNP affect scalability?
 What is the querying performance?
 Per-query traffic overhead?
 Turnaround time for each query?

25
Usability: Applications

We evaluated SNooPy with
 Quagga BGP: RouteView (external specification)
Explains oscillation caused by router misconfiguration
 Hadoop MapReduce: (reported provenance)
 Detects a tampered Mapper that returns inaccurate results
 Declarative Chord DHT: (inferred provenance)
 Detects an Eclipse attacker that always returns its own ID


SNooPy solves problems reported in existing work
26
Runtime Overhead: Storage
Over 50% of the overhead
was due to signatures and
acks. Batching messages
would help.

Manageable storage overhead
 One week of data: E.g. Quagga – 7.3GB; Chord – 665MB
27
Query Latency
largely due to
replaying logs
Verification
Replay
Download
dominated by
verifying logs
and snapshots


Query latency varies from application to application
Reasonable overhead
28
Summary

Secure network provenance in untrusted environments
 Requires no trusted components
 Strong guarantees even in the presence of Byzantine faults

Formal proof in a technical report
 Significantly extends traditional provenance model

Past and transient state, provenance of change, …
 Efficient storage: reconstructs provenance graph on demand
 Application-independent

(Quagga, Hadoop, and Chord)
Questions?
Project website: http://snp.cis.upenn.edu/
29
Download