
Research to Support
Robust Cyber Defense
Fred B. Schneider
Study commissioned for Dr. Jay Lala
DARPA
Information Technology Office
Study Committee
Jim Anderson, University of North Carolina
Stephanie Forrest, University of New Mexico
Carl Landwehr, National Science Foundation
Teresa Lunt, Palo Alto Research Center
Mike Reiter, Carnegie Mellon University
Fred B. Schneider, Cornell University (chairman)
Kishor Trivedi, Duke University
Study Process
Two meetings in Washington, DC
Briefings from subject-matter experts:
– Tarek Abdelzaher, Univ Virginia
– Massoud Amin, EPRI
– Anish Arora, Ohio State Univ
– Steve Bellovin, AT&T
– Ken Birman, Cornell Univ
– Alan Demers, Cornell Univ
– Steve Goddard, Univ Nebraska
– Mohamed Gouda, Univ Texas
– Ted Herman, Univ Iowa
– Erica Jen, Santa Fe Institute
– Chandra Kintala, Avaya
– Simon Levin, Princeton Univ
– Alfred Spector, IBM Rsch
– Wietse Venema, IBM Rsch
Study Goals
Identify research areas to enable the design and
implementation of networked computer systems that
tolerate attacks and failures by automatically changing
state or structure during execution.
Strategy for defense:
– [Prior]: Prevention eliminates vulnerabilities.
– [Near term]: Render ongoing attacks ineffective
through dynamic changes to state.
– [Longer term]: Alter vulnerabilities (viz. co-evolution).
– [Eventually]: Self-repair of identified problems.
Presentation Outline
– Where industry is heading.
– A characterization of robustness.
– New points of leverage.
– Complementary research.
Industry Context: IBM
IBM perceived customer concerns:
– Total Cost of Ownership.
Solution: self-managing / self-configuring systems
– Emphasis on Quality of Service.
Solution: self-optimizing / scalability + isolation
– Flexibility to deploy new applications.
New IBM initiative: autonomic computing
Not concerned with Byzantine failures or highly
malicious attacks.
Industry Context: Microsoft
Microsoft perceived customer concerns:
– Total Cost of Ownership.
Solution: automatic patch and upgrade
– Harness the network.
Solution: interoperability and transparency
– Security
[Bill Gates internal memo on “Trustworthy Computing”,
approx Jan 16, 2002]
Trade assurance for bugs and complexity (?)
New Microsoft initiative: .NET
Not concerned with Byzantine failures or highly
malicious attacks.
Industry Context: Power Grid
EPRI perceived concerns:
– Reliability of grid
Propagation of effect.
Operation with reduced capacity cushion.
– Move to decentralized, market-based control.


Separate delivery channel and control channels.
Little concern about hostile attacks (either
in delivery channel or control channel).
Summary:
Industry versus DoD Needs
[Figure: chart with an Attacks axis (random to malicious) and a Failures axis (benign to Byzantine), marking where Industry Direction and DoD Needs fall.]
Addressing DoD Needs:
Dimensions of Robustness
[S. Levin]
Robustness = {Diversity, Redundancy, Modularity}
The time is right to exploit new opportunities!
Addressing DoD Needs:
New Research Opportunities
– Temporal and spatial run-time diversity.
– Scalable redundancy.
– Self-stabilization.
– Natural robustness via biological metaphors and
systemic effects.
Research Thrust:
Run-time Diversity
Limited success to date:
Obtaining diversity manually is expensive.
– Multiplies costs associated with design, implementation, and test.
– Integration and interoperation are expensive.
Obtaining diversity automatically has not been
explored aggressively.
– Modern compiler technology could help here.
– Run-time environments are also possible leverage points.
Creating Diversity at Run-time
Run-time diversity is associated with
randomness or non-determinacy.
The impact will depend on where it is applied:
– application level programs
– system level programs
– generation of application/system.
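One way to picture randomness applied at the system level is sketched below: each running instance of a toy interpreter draws its own secret opcode encoding, so code injected without knowledge of that key is rejected. The interpreter, the opcode names, and the XOR-key scheme are all hypothetical and are not taken from the study; this is a minimal illustration of the idea, not a hardened mechanism.

```python
import random

OPS = ["push1", "add", "dup", "halt"]  # hypothetical 4-instruction machine

class Instance:
    """One running copy of the interpreter with its own secret encoding key."""

    def __init__(self, rng):
        # Keys below 4 are skipped so raw opcodes 0..3 never decode to a valid op.
        self.key = rng.randrange(4, 256)

    def load(self, program):
        """Trusted loader: encode opcode names with this instance's key."""
        return bytes(OPS.index(op) ^ self.key for op in program)

    def run(self, code):
        stack = []
        for byte in code:
            opcode = byte ^ self.key
            if opcode >= len(OPS):
                raise ValueError("illegal instruction")  # injected code lands here
            op = OPS[opcode]
            if op == "push1":
                stack.append(1)
            elif op == "add":
                stack.append(stack.pop() + stack.pop())
            elif op == "dup":
                stack.append(stack[-1])
            elif op == "halt":
                break
        return stack

instance = Instance(random.Random())
legit = instance.load(["push1", "dup", "add", "halt"])
result = instance.run(legit)          # trusted code runs normally

injected = bytes([0, 2, 1, 3])        # same program in the unrandomized encoding
try:
    instance.run(injected)
    injected_ran = True
except ValueError:
    injected_ran = False
```

The same program succeeds when it passes through the trusted loader and fails when it arrives in the default encoding, which is the effect that diversity at the "generation of application/system" level aims for.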
Run-time Diversity in Cryptography
Recent crypto advances introduce:
Spatial diversity: Different components hold
different, but related, secrets.
– Compromising one doesn’t compromise all.
Temporal diversity: Secret state changed
from time to time.
– Limits adversary’s abilities after compromises.
Example of Spatial Diversity in Cryptography:
Function Sharing
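Public K / private k = [s1, s2, s3, s4]
[Figure: a service made of four servers, each holding one share si of k. A message m is sent to the service, and partial results pr1, pr2, pr3 come back.]
sig ← combine({pr1, pr2, pr3})
verify(K, m, sig) succeeds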
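The slide does not name a particular scheme. As one concrete possibility, the sketch below follows the structure of Shoup-style threshold RSA with toy parameters: any 3 of 4 servers produce partial results that combine into an ordinary RSA signature, and no single server ever holds k. The parameter values, function names, and the use of a single trusted dealer are assumptions made for illustration. The numbers are far too small for real use, and negative exponents in `pow` need Python 3.8 or later.

```python
import math
import random
from itertools import combinations

# Toy parameters (assumed for illustration only).
P, Q = 23, 47                          # safe primes: 2*11 + 1 and 2*23 + 1
N = P * Q                              # public modulus
M = ((P - 1) // 2) * ((Q - 1) // 2)    # order of the squares modulo N
E = 5                                  # public exponent: a prime > SERVERS
D = pow(E, -1, M)                      # private exponent k (seen only by the dealer)
SERVERS, THRESHOLD = 4, 3
DELTA = math.factorial(SERVERS)

def deal_shares(rng):
    """Split D into shares s1..s4 with a random degree-2 polynomial mod M."""
    coeffs = [D] + [rng.randrange(M) for _ in range(THRESHOLD - 1)]
    return {i: sum(c * i**j for j, c in enumerate(coeffs)) % M
            for i in range(1, SERVERS + 1)}

def partial_result(m, share):
    """What one server computes from m and its own share."""
    return pow(m, 2 * DELTA * share, N)

def scaled_lagrange(i, ids):
    """DELTA times the Lagrange coefficient at 0; always an integer."""
    num = den = 1
    for j in ids:
        if j != i:
            num, den = num * -j, den * (i - j)
    return DELTA * num // den

def combine(m, partials):
    """Combine THRESHOLD partial results into an ordinary RSA signature on m."""
    w = 1
    for i, pr in partials.items():
        w = w * pow(pr, 2 * scaled_lagrange(i, list(partials)), N) % N
    # Now w**E == m**(4*DELTA**2) mod N; Bezout coefficients recover the E-th root.
    e_prime = 4 * DELTA**2
    a = pow(e_prime, -1, E)
    b = (1 - a * e_prime) // E           # a*e_prime + b*E == 1
    return pow(w, a, N) * pow(m, b, N) % N

def verify(m, sig):
    return pow(sig, E, N) == m % N

shares = deal_shares(random.Random())
m = 42                                   # stands in for a hashed message, coprime to N
sig = combine(m, {i: partial_result(m, shares[i]) for i in (1, 2, 3)})
all_ok = all(
    verify(m, combine(m, {i: partial_result(m, shares[i]) for i in ids}))
    for ids in combinations(range(1, SERVERS + 1), THRESHOLD)
)
```

The verifier uses only the public values N and E, so it cannot tell which three servers took part, matching the `verify(K, m, sig)` step on the slide.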
Example of Temporal Diversity in Cryptography:
Forward-Secure Signatures
Time period    Private key    Public key
i              ki             K
i+1            ki+1           K
(The private key rolls forward from ki to ki+1; the public key K is unchanged.)
verify(K, m, i+1, sign(ki+1, m)) succeeds
verify(K, m, i, sign(ki+1, m)) fails
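The roll-forward idea can be sketched with a symmetric analogue: a forward-secure MAC in which each period's key is a one-way hash of the previous one. This is a swapped-in technique rather than a true forward-secure signature. Here the verifier holds the initial key k0 instead of a public key K, and the function names and key-derivation label are assumptions.

```python
import hashlib
import hmac

def roll_forward(key):
    """Derive the next period's key; the caller must erase the old one."""
    return hashlib.sha256(b"roll-forward" + key).digest()

def sign(key_i, m):
    """Tag message m with the key of the current period."""
    return hmac.new(key_i, m, hashlib.sha256).digest()

def verify(k0, m, period, tag):
    """Verifier re-derives the key for `period` from the initial key k0."""
    key = k0
    for _ in range(period):
        key = roll_forward(key)
    return hmac.compare_digest(sign(key, m), tag)

k0 = b"\x00" * 32           # placeholder initial key; use a random key in practice
i = 3
k_i = k0
for _ in range(i):
    k_i = roll_forward(k_i)
k_next = roll_forward(k_i)  # signer now keeps only k_next; k_i is erased

m = b"status report"
tag = sign(k_next, m)
ok_current = verify(k0, m, i + 1, tag)
ok_earlier = verify(k0, m, i, tag)
```

An adversary who compromises the signer during period i+1 obtains k_next but, assuming the hash is one-way, cannot recover k_i, so tags for earlier periods cannot be forged.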
Example of Spatial and Temporal Diversity:
Proactive Function Sharing
Public K / private k = [s1, s2, s3, s4]
[Figure: the four servers of the service replace their shares s1, s2, s3, s4 with fresh shares t1, t2, t3, t4.]
Public K / private k = [t1, t2, t3, t4]
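The refresh from shares s to shares t can be sketched with Shamir secret sharing: adding a fresh sharing of zero changes every share while leaving k untouched, so shares stolen before the refresh do not combine with shares stolen after it. The field modulus, the 3-of-4 threshold, and the single-dealer `refresh` below are simplifying assumptions. In an actual proactive protocol the servers would generate the zero-sharing jointly.

```python
import random

PRIME = 2**61 - 1          # field modulus (a Mersenne prime); assumed choice
SERVERS, THRESHOLD = 4, 3

def poly_shares(constant, rng):
    """Shares f(1)..f(SERVERS) of a random polynomial with f(0) = constant."""
    coeffs = [constant] + [rng.randrange(PRIME) for _ in range(THRESHOLD - 1)]
    return {i: sum(c * pow(i, j, PRIME) for j, c in enumerate(coeffs)) % PRIME
            for i in range(1, SERVERS + 1)}

def refresh(shares, rng):
    """Add a fresh sharing of zero: every share changes, the secret does not."""
    zero = poly_shares(0, rng)
    return {i: (s + zero[i]) % PRIME for i, s in shares.items()}

def reconstruct(shares):
    """Lagrange interpolation at 0 from THRESHOLD shares."""
    total = 0
    for i, s in shares.items():
        coeff = 1
        for j in shares:
            if j != i:
                coeff = coeff * j % PRIME * pow((j - i) % PRIME, -1, PRIME) % PRIME
        total = (total + s * coeff) % PRIME
    return total

rng = random.Random()
k = 123456789
s = poly_shares(k, rng)                 # shares s1..s4
t = refresh(s, rng)                     # shares t1..t4 after one refresh

from_old = reconstruct({i: s[i] for i in (1, 2, 3)})
from_new = reconstruct({i: t[i] for i in (1, 2, 4)})
from_mixed = reconstruct({1: s[1], 2: s[2], 3: t[3]})   # two old shares, one new
```

Both `from_old` and `from_new` recover k, while the mixed set yields an unrelated value except with negligible probability, which is what limits an adversary who compromises servers gradually.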
Run-time Diversity in Cryptography:
Next Steps


Deploy principles of crypto run-time diversity
(both spatial and temporal) in the construction
of distributed services.
Leverage existing crypto diversity more broadly:
Practical multi-party computation?
(= “Spread-spectrum” computing.)
Research Thrust:
Scalable Redundancy

Redundancy has been widely studied as a
method to achieve fault tolerance:
– Replication of servers
– Redundant routing

The key problem now is scalability.
Scalable Redundancy:
Central Challenge
Scalable methods for handling redundancy provide
new—often weaker—types of guarantees:
– Probabilistic
– Eventual consistency
– Monotonic convergence
How to build systems with these new guarantees?
– Transform weak guarantees into stronger ones?
– Settle for combinations of the new guarantees?
Example of Scalable Redundancy:
Epidemic and Gossip Protocols
Key characteristic: Information exchanges involve
randomly or opportunistically chosen gossip
partners.

Resulting protocols are:
– fault-tolerant
– scalable, and
– self-organizing

The few actual deployments are promising:
– Xerox PARC Clearinghouse Replicated Database
– MIT Lazy Replication
– Xerox Bayou database system
– Astrolabe distributed spreadsheet
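The key characteristic above can be simulated in a few lines. The sketch below is a push-style gossip model, assumed for illustration and not taken from any of the listed systems: each informed node pushes an update to one uniformly random partner per round, and a push may be lost.

```python
import random

def push_gossip(n, rng, fail_prob=0.0):
    """Return the number of rounds until all n nodes hold the update.

    Each round, every informed node pushes to one partner chosen uniformly
    at random; a push is lost with probability fail_prob.
    """
    informed = {0}
    rounds = 0
    while len(informed) < n:
        rounds += 1
        for _ in range(len(informed)):      # senders are fixed at the start of the round
            partner = rng.randrange(n)
            if rng.random() >= fail_prob:
                informed.add(partner)
    return rounds

rng = random.Random(1)
rounds_small = push_gossip(100, rng)
rounds_large = push_gossip(10_000, rng)
rounds_lossy = push_gossip(10_000, rng, fail_prob=0.3)
```

Growing the system a hundredfold adds only a modest number of rounds, and losing a fraction of the messages slows dissemination without stopping it. The guarantee is probabilistic, as the previous slide notes.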
Example of Scalable Redundancy:
Quorum Systems
[Figure: two quorums drawn over a set of servers.]
Key characteristic: Operations access quorums
of servers. Quorums can be a subset of all
servers.
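A minimal sketch of the idea, assuming majority quorums over five servers and versioned values (both choices are illustrative): because any two majorities share a server, a read from any quorum sees the most recent write even though no operation touches all servers. Concurrency and failures are left out.

```python
from itertools import combinations

class Server:
    def __init__(self):
        self.version, self.value = 0, None

def majority_quorums(servers):
    """All minimal majorities; any two of them share at least one server."""
    size = len(servers) // 2 + 1
    return list(combinations(servers, size))

def write(quorum, value):
    """Store value on every server of the quorum under a new, higher version."""
    version = max(s.version for s in quorum) + 1
    for s in quorum:
        s.version, s.value = version, value

def read(quorum):
    """Return the value held by the highest-versioned server in the quorum."""
    newest = max(quorum, key=lambda s: s.version)
    return newest.value

servers = [Server() for _ in range(5)]
quorums = majority_quorums(servers)
write(quorums[0], "alpha")          # touches servers 0, 1, 2
write(quorums[-1], "beta")          # touches servers 2, 3, 4
values_read = {read(q) for q in quorums}
```

Here each operation contacts three of five servers. Scalable quorum constructions aim to shrink that fraction further while keeping the intersection property.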
Scalable Redundancy:
Next Steps


Accommodate weaker properties of scalable
redundancy technologies in higher-level apps.
Use realistic network topologies:
– Irregularity in interconnection.
– Clustering and non-uniform link bandwidths.

Understand and exploit interactions with QoS:
– Implement QoS guarantees using gossip protocols.
– Leverage existing QoS guarantees in gossip protocols.

Understand and exploit threshold phenomena.
Research Thrust:
Self-Stabilization
Key characteristic: System eventually transitions to
normal operating states in response to arbitrary
transitions (to arbitrary states).
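[Figure: a fault/attack moves the system from good states to bad states; the system then transitions back to good states.]
Self-stabilization expands
the diversity of states
from which a system can
operate.
More states:
⇒ Fewer assumptions.
⇒ Fewer vulnerabilities.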
Self-Stabilization:
Hallmarks of Systems


Highly decentralized: Convergence is an
“emergent property” and error states are
tolerated without being detected.
Forgetful: State is regenerated; old state is
forgotten.
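Both hallmarks show up in Dijkstra's K-state token ring, a classic self-stabilizing protocol that the slides do not name and that is used here only as an illustration. Each machine reads just its left neighbor, no machine detects that the state is corrupt, and from any starting state the ring converges to exactly one circulating privilege. The ring size and the random scheduler below are assumptions.

```python
import random

def privileged(x):
    """Indices of machines that currently hold a privilege (token)."""
    n = len(x)
    result = [0] if x[0] == x[n - 1] else []
    result += [i for i in range(1, n) if x[i] != x[i - 1]]
    return result

def step(x, k, rng):
    """Let one privileged machine, chosen arbitrarily, make its move."""
    i = rng.choice(privileged(x))
    if i == 0:
        x[0] = (x[0] + 1) % k       # machine 0 advances its counter
    else:
        x[i] = x[i - 1]             # other machines copy their left neighbor

def stabilize(x, k, rng):
    """Run until exactly one privilege remains; return the number of moves."""
    moves = 0
    while len(privileged(x)) != 1:
        step(x, k, rng)
        moves += 1
    return moves

rng = random.Random()
n, k = 6, 7                                  # k >= n states per machine
ring = [rng.randrange(k) for _ in range(n)]  # arbitrary state after a fault/attack
moves = stabilize(ring, k, rng)
```

After `stabilize` returns, further calls to `step` keep exactly one privilege in the ring. The corrupted state is simply overwritten as the token circulates, which is the "forgetful" behavior described above.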
Self-Stabilization:
Promise of Success

The few actual deployments are promising:
– SUN’s Netra Proxy Server
– MS Research Aladdin Lookup Service
– DEC/Compaq Autonet Configuration Protocols

Self-stabilization well suited to network protocols,
where transient disruptions are already tolerated by
upper system levels.
Self-Stabilization:
Next Steps

How might self-stabilization be extended?
Convergence from only some configurations.
Distinguish state components (e.g. keys, secrets, models of
reality) and have only some converge.

Scalability?
System size, convergence time, severity of transient.

Dimensions of containment:
Space: bound infection / contamination.
Time: speed for convergence.
Safety: how badly is function degraded during repair.

Composition and control:
Go beyond control structure to abstract data types, etc.
Develop basis for compositional construction.
Research Thrust:
Natural Robustness
Biological and other robustness metaphors…
– Work at multiple levels:
Time scale (lifetime of organism vs species).
Structure (cell vs organism vs ecosystem).
– Hallmarks of such robustness:
Robustness at one level translates into robustness at a
different level.
Highly decentralized: Convergence is an “emergent
property.”
Widespread use of diversity.
Adaptive and always evolving.
Use disposable components.
Natural Robustness:
Leveraging Systemic Effects
Natural robustness gains much from
systemic effects. So can we.
– Epidemiology
Logarithmic delays
– Percolation theory
Critical point phenomena
Bimodal behaviors
– Graph theory
Small-world phenomena
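A critical point of the kind percolation theory describes can be seen in a small experiment. The sketch assumes a uniformly random graph, a far simpler model than a real network topology, and measures the largest connected component on either side of the known threshold at mean degree 1.

```python
import random

def largest_component_fraction(n, mean_degree, rng):
    """Fraction of nodes in the largest component of a random graph."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for _ in range(int(mean_degree * n / 2)):   # each edge adds 2 to the total degree
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            parent[a] = b                       # merge the two components

    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

rng = random.Random(7)
below = largest_component_fraction(5000, 0.5, rng)
above = largest_component_fraction(5000, 2.0, rng)
```

Below the threshold the largest component is a tiny fraction of the nodes; above it, a single component spans most of the graph. The behavior is bimodal rather than gradual, which is the kind of systemic effect the slide suggests exploiting.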
Natural Robustness:
Promise of Success

The few actual deployments are promising:
– Artificial immunology applied to cyber-security, robotics, and data
mining.

Convergence: biology ↔ computing
– Trends in computing have biological interpretations:
Software Rejuvenation (e.g. Apache web server).
– Biology making greater use of computing:
Gene-expression analysis, phylogenetic tree reconstruction, cell
signaling models, minimal cell project, smart matter.
Natural Robustness:
Next Steps (1)

Pair new results from biology with robustness challenges
in computer networks.
– Exploit information about software evolution.
E.g., Phylogenetic trees for predicting vulnerabilities.
– Intra-cellular signaling and cascades (chemotaxis).
– Inter-cellular signaling networks (e.g., immune systems).
– Genetics:
Genetic buffering.
Individual gene repairs.
Evolutionary mechanisms (genotype/phenotype mappings).
– Ecosystem modeling:
Diversity, keystone species, patch models, allometry, resource
flows.
Natural Robustness:
Next Steps (2)

Further utilize systemic effects in
networked systems:
– Epidemic and gossip protocols.
– Survivability of computer networks.
– Propagation of power failures in electrical
grids.
– Epidemiological approaches to computer
viruses.
Robust Cyber Defense:
Complementary Research (1)

Support for on-the-fly system change:
– Software rejuvenation (refresh data or environment)
– Control structure / data representation change
– Adaptive fault-tolerance (fault-tolerance assumptions change)
– Self-healing real-time schedulers
Enhanced detection:
– Growing memory size enables rollback to a previous
state
– Application-specific monitoring
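Software rejuvenation, the first item above, can be sketched as a wrapper that replaces a worker with a fresh one after a fixed number of requests, discarding whatever state has accumulated. The class names, the simulated leak, and the request limit are hypothetical.

```python
class LeakyWorker:
    """Stand-in for a server process whose state degrades with use."""

    def __init__(self):
        self.cache = []

    def handle(self, request):
        self.cache.append(request)      # simulated leak: never trimmed
        return f"ok:{request}"

class Rejuvenating:
    """Replace the wrapped worker with a fresh one every max_requests requests."""

    def __init__(self, factory, max_requests):
        self.factory, self.max_requests = factory, max_requests
        self.worker, self.served, self.restarts = factory(), 0, 0

    def handle(self, request):
        if self.served == self.max_requests:
            self.worker, self.served = self.factory(), 0   # old state is discarded
            self.restarts += 1
        self.served += 1
        return self.worker.handle(request)

service = Rejuvenating(LeakyWorker, max_requests=100)
replies = [service.handle(r) for r in range(1000)]
```

The leaked state never grows past 100 entries, at the cost of periodically losing whatever the worker had cached. The limit trades refresh overhead against how far degradation is allowed to progress.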
Robust Cyber Defense:
Complementary Research (2)

Machine learning
– Reinforcement learning (to adjust parameters
in accordance with new information or
feedback).
– Genetic programming (to evolve small
software components).
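Parameter adjustment from feedback can be sketched with an epsilon-greedy bandit, about the simplest reinforcement-learning setting. The candidate timeouts, the simulated success rates, and the function names are invented for illustration and do not describe any real system.

```python
import random

def tune(candidates, reward_fn, rng, steps=5000, epsilon=0.1):
    """Epsilon-greedy selection of the candidate with the best average reward."""
    arms = range(len(candidates))
    counts = [0] * len(candidates)
    means = [0.0] * len(candidates)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(candidates))       # explore a random setting
        else:
            arm = max(arms, key=means.__getitem__)     # exploit the best so far
        reward = reward_fn(candidates[arm], rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]   # running average
    return candidates[max(arms, key=means.__getitem__)]

def simulated_feedback(timeout_ms, rng):
    """Hypothetical environment: 1.0 if the request succeeded, else 0.0."""
    success_rate = {50: 0.2, 200: 0.9, 1000: 0.5}[timeout_ms]
    return 1.0 if rng.random() < success_rate else 0.0

best = tune([50, 200, 1000], simulated_feedback, random.Random(0))
```

Because a fraction of the decisions stay exploratory, the same loop keeps re-estimating every setting and can shift to a different one if the environment changes.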