Genesis: Software Diversity

GENESIS: A Framework For
Achieving Component Diversity
John C. Knight, Jack W. Davidson,
David Evans, Anh Nguyen-Tuong
University of Virginia
Chenxi Wang
Carnegie Mellon University
Nice Meeting Facility!
DARPA SRS Kickoff
What Is The Problem?


Many machines with the same vulnerability
What is a vulnerability?



A vulnerability is a fault in the classic sense of
dependability theory
Fault types:
  Degradation: something breaks in one copy
  Design: flaw in design affects all copies
Software faults are design faults
Redundancy & Degradation Faults
[Figure: N Modular Redundant (NMR) system. Inputs feed identical computers Computer1, Computer2, ..., ComputerN, whose results go to a voter that produces the outputs. Labeled functions: error detection, damage assessment, state restoration, continued service.]
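The voter in an NMR arrangement can be sketched as follows. This is a minimal illustration, not code from the project: the function name `majority_vote` and the choice of a strict-majority rule are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of replicas, else None.

    Hypothetical helper: a real voter would also trigger damage assessment
    and state restoration when replicas disagree.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Require more than half of the replicas to agree.
    return value if count > len(outputs) // 2 else None


# One replica suffers a degradation fault; the other two mask it.
print(majority_vote([42, 42, 7]))
# Identical replicas sharing one design fault all agree on the wrong value,
# so the voter cannot detect the error.
print(majority_vote([13, 13, 13]))
```

The second call hints at why simple replication helps with degradation faults but not with design faults: every copy fails the same way.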
Redundancy & Design Faults


Redundancy is diversity
Works well for degradation faults:





Faults have predictable statistical behavior
Effective mathematical models available
What about design faults?
Simple replication doesn’t work, obviously
Requires different (diverse) designs to be
effective
Multiple Systems
[Figure: multiple systems (Linux, Windows, OS/2) shown against a specification, with their vulnerabilities.]
Design Diversity Development
[Figure: design diversity development. A component specification feeds Version Development 1, 2, ..., N, which are subject to interaction barriers and technology restrictions; the resulting versions go to system assembly.]
Goal: Different Faults Because Of Independent Development
Design Diverse System
[Figure: N-version system. Inputs feed Version1, Version2, ..., VersionN, whose results go to a voter that produces the outputs. Callout: how “different” are the versions?]
Assumption: Different Faults Because Of Independent Development
Design Diversity

Does not work well for design faults






No upper bound on failure probability
No practical statistical models
No definition of “design diversity”
No procedure for achieving it
Linux vs. Windows is, however, worse—it is
purely ad hoc
But, what else is there?
Data Diversity

Heisenbug (Jim Gray):






Program fails
Sometimes if you rerun the program, it works
Applied to Tandem operating system
We all do this in daily operation
Several variants of approach developed
Comprehensive, general approach developed:

Data diversity
Data Diverse System
[Figure: data diverse system, N-copy architecture. The same software runs as Copy1, Copy2, ..., CopyN. The inputs pass through a data re-expression step before each copy and a reverse data re-expression step after it; the results go to a voter.]
Data Diversity





Low cost—software is copied
Unknown performance for design faults
Experimental evidence that it works well
Can be very powerful:
sin(x) = sin(a + b)
       = sin(a)cos(b) + cos(a)sin(b)
       = sin(a)sin(90 - b) + sin(90 - a)sin(b)
Choose a and b, repeat, vote
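The sine example can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: `faulty_sin_deg` is a hypothetical buggy routine that is wrong only near 30 degrees, the fixed `splits` stand in for "choose a and b", and the median is used as a simple inexact voter.

```python
import math
from statistics import median


def sin_deg(angle_deg):
    """Reference sine taking degrees."""
    return math.sin(math.radians(angle_deg))


def faulty_sin_deg(angle_deg):
    """Hypothetical buggy sine: wrong only in a narrow band around 30 degrees."""
    if 29.5 <= angle_deg <= 30.5:
        return 0.0
    return sin_deg(angle_deg)


def reexpressed_sin(x_deg, a_deg, sin_impl):
    """Compute sin(x) as sin(a)sin(90 - b) + sin(90 - a)sin(b) with b = x - a."""
    b_deg = x_deg - a_deg
    return (sin_impl(a_deg) * sin_impl(90 - b_deg)
            + sin_impl(90 - a_deg) * sin_impl(b_deg))


def data_diverse_sin(x_deg, sin_impl, splits=(10, 25, 40, 55, 70)):
    """Run one re-expressed copy per choice of a, then vote with the median."""
    results = [reexpressed_sin(x_deg, a, sin_impl) for a in splits]
    return median(results)


direct = faulty_sin_deg(30)                      # hits the faulty region
voted = data_diverse_sin(30, faulty_sin_deg)     # re-expressed inputs avoid it
print(direct, voted)
```

Each copy calls the same faulty routine, but on inputs that have been moved away from the failure region, which is the point of data diversity.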
The Vision

[Figure: diversity specifications and software enter the GENESIS Diversity Engine, which emits a diverse population of functionally-equivalent software.]
Automated production of design-diverse, functionally-equivalent software
Automatic production of data-diverse, functionally-equivalent software
It might work…
Overall Approach


Analysis of the diversity space
Automated production of functionally-equivalent software and data:
  Compiler and meta-compiler technology:
    Source-level transformations
    Compiler transformations
  Virtual Machine Technology:
    Data stream rewriting
    Run-time software translation techniques
Rationale that diversity is an effective defense mechanism:
  Experimental evaluation
  Modeling of effects of diversity on known vulnerabilities
  Application to COTS software
[Figure: diversity specifications and software enter the GENESIS Diversity Engine, which emits a diverse population of functionally-equivalent software.]
Hierarchic Design Diversity
[Figure: hierarchic design diversity. Source-to-source transformations turn a software application into Source Code Version 1 through Version N; compiler transformations turn each source version into several binaries (Binary 1, Binary 2, ..., Binary i); run-time transformations are then applied to the binaries.]
Source to Source Transformations

Underlying model of tasks:
  e.g. fork/execs vs. threads
Process interaction:
  e.g. low-level semaphores vs. higher-level monitors
Fundamental libraries:
  e.g. libc, sockets, etc…
Diversity achieved by component combinations
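A rough sketch of how component combinations multiply into variants is given below. The option names, in particular the two library entries, are hypothetical placeholders chosen for illustration; the slides do not list concrete alternatives beyond the examples above.

```python
from itertools import product

# Hypothetical choices along each dimension named on the slide.
TASK_MODELS = ["fork/exec", "threads"]
PROCESS_INTERACTION = ["semaphores", "monitors"]
LIBRARIES = ["library-set-A", "library-set-B"]  # placeholder names


def enumerate_variants():
    """Return one configuration per combination of component choices."""
    return [
        {"tasks": tasks, "interaction": interaction, "libraries": libs}
        for tasks, interaction, libs in product(
            TASK_MODELS, PROCESS_INTERACTION, LIBRARIES)
    ]


variants = enumerate_variants()
print(len(variants))  # 2 * 2 * 2 combinations
for variant in variants[:2]:
    print(variant)
```

Even two alternatives per dimension give eight distinct builds, and the count grows multiplicatively with each added dimension or alternative.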
Compiler Transformations


Generate N compilers that target different
architectures
Manipulate formal description of target
architecture—Computer Systems Description
Language (CSDL):



Instruction Set Architecture (ISA) specification
Calling convention specification
Example diversity techniques:




Different calling conventions
ISA subsets created, enforced dynamically
Memory layouts—code and data
Implement the above within the same program
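As one illustration of the memory-layout technique, the sketch below derives a different data layout from each seed. It is a toy model under stated assumptions: `diversify_layout`, the padding range, and the alignment are invented for the example and are not taken from CSDL or any Genesis tool.

```python
import random
from typing import Dict, List, Tuple


def diversify_layout(variables: List[Tuple[str, int]], seed: int,
                     max_padding: int = 16, alignment: int = 8) -> Dict[str, int]:
    """Assign each (name, size) variable an offset using a seeded random order and padding."""
    rng = random.Random(seed)
    order = list(variables)
    rng.shuffle(order)                      # vary the relative order of objects
    offsets: Dict[str, int] = {}
    cursor = 0
    for name, size in order:
        cursor += rng.randrange(0, max_padding + 1)              # random gap
        cursor = (cursor + alignment - 1) // alignment * alignment  # round up
        offsets[name] = cursor
        cursor += size
    return offsets


VARIABLES = [("buffer", 64), ("length", 8), ("callback", 8), ("flags", 4)]
for seed in (1, 2):
    print(seed, diversify_layout(VARIABLES, seed))
```

An exploit that depends on the distance between two objects, for example a buffer and a function pointer, would have to be re-derived for each variant.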
Run-time Transformations


Software Dynamic Translation
STRATA system:
  Layer between hardware and application
  Designed to be easily retargeted
  Virtual machine provides:
    Underlying target
    Supplementary rules on use of target
Software Dynamic Translation systems:
  FX!32
  Dynamo
  Transmeta


STRATA—Basic Operation
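[Figure: STRATA basic operation. Inside the SDT virtual machine: context capture, new PC, a cache check ("Cached?"), new fragment, then a fetch, decode, translate loop that advances to the next PC until the fragment is finished ("Finished?"), followed by a context switch to the host CPU, which executes translated code from the cache. The diagram is annotated "Enforce Desired Policies".]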
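The translation loop can be approximated with a toy interpreter. This is a sketch under stated assumptions and not STRATA itself: the three-opcode instruction set, the single accumulator, and the names `build_fragment`, `run`, and `PolicyViolation` are all invented for illustration, and "translation" here is just copying decoded instructions into a fragment.

```python
from typing import Dict, FrozenSet, List, Tuple

Instr = Tuple[str, int]  # (opcode, operand) in a hypothetical toy ISA
CONTROL_TRANSFERS = frozenset({"jmp", "jnz", "halt"})


class PolicyViolation(Exception):
    """Raised when the application uses an opcode outside the allowed subset."""


def build_fragment(program: List[Instr], pc: int,
                   allowed: FrozenSet[str]) -> List[Instr]:
    """Fetch, decode, and translate from pc through the next control transfer."""
    fragment = []
    while True:
        opcode, operand = program[pc]            # fetch and decode
        if opcode not in allowed:                # enforce the desired policy
            raise PolicyViolation(f"opcode {opcode!r} at pc {pc} not allowed")
        fragment.append((opcode, operand))       # translate (identity here)
        pc += 1
        if opcode in CONTROL_TRANSFERS:          # fragment finished
            return fragment


def run(program: List[Instr], allowed: FrozenSet[str]) -> Tuple[int, int]:
    """Execute program; return (accumulator, number of fragments translated)."""
    cache: Dict[int, List[Instr]] = {}
    translations = 0
    acc = 0
    pc = 0
    while pc is not None:
        if pc not in cache:                      # cache miss: build new fragment
            cache[pc] = build_fragment(program, pc, allowed)
            translations += 1
        next_pc = pc
        for opcode, operand in cache[pc]:        # execute the cached fragment
            next_pc += 1
            if opcode == "add":
                acc += operand
            elif opcode == "jmp" or (opcode == "jnz" and acc != 0):
                next_pc = operand
            elif opcode == "halt":
                next_pc = None
        pc = next_pc
    return acc, translations


# Count down from 3: the loop body is translated once and then reused.
PROGRAM = [("add", 3), ("add", -1), ("jnz", 1), ("halt", 0)]
print(run(PROGRAM, frozenset({"add", "jnz", "halt"})))
```

Because every instruction passes through the translator before it runs, the allowed subset can differ per variant or change over time without touching the application.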
Example STRATA Policies

Apply compile-time transformations dynamically:
  Rearrangement of basic blocks, calling sequence transformations, etc…
Dynamic injection and enforcement of behavioral policies:
  E.g. resource usage (files, sockets, tasks)
Language diversity: dialects
  Only allow subsets of original instruction set
  Vary subsets dynamically
STRATA System Architecture
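[Figure: STRATA system architecture. The application runs on the Strata virtual machine, which runs on the host CPU. Labeled parts of the virtual machine: context management, memory management, cache management, linker, and the Strata virtual CPU, with a group marked as machine-independent components; a target interface; and target-specific functions.]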


Data Diversity
Diversity in the data space can avoid sequences of events that lead to failure
Diversity space offers large range of data re-expression options:
  Precision (Exact, Approximate)
  Locality (Internal, External)
  Sequence (inorder-ontime, inorder-offtime, outoforder-ontime, outoforder-offtime)
[Figure: three-axis diagram of the data diversity space, with axes Precision, Locality, and Sequence.]
Data Re-expression Examples

Change floating point values:
  Lose precision
  Translate
  Rotate
Data sequences:
  Reorder data
  Change timing of data
Memory layout (code and data)
Reorder transactions
Reorder data in activation records
SQL Rewriting
…many more examples…
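A few of these re-expressions (translate, rotate, reorder) can be sketched for a simple computation. The example below is an assumption-laden illustration, not project code: the centroid function stands in for "the software", and `run_copy` and its parameters are hypothetical names. It relies on the fact that the centroid of translated and rotated points is the translated and rotated centroid, so the reverse re-expression is exact up to rounding.

```python
import math


def centroid(points):
    """The unmodified software under protection: mean of 2-D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rotate(point, theta):
    """Rotate a single point about the origin by theta radians."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def run_copy(points, dx, dy, theta, order):
    """Re-express the input, run the same software, then reverse the re-expression."""
    reordered = [points[i] for i in order]                   # reorder data
    translated = [(x + dx, y + dy) for x, y in reordered]    # translate
    rotated = [rotate(p, theta) for p in translated]         # rotate
    cx, cy = centroid(rotated)
    ux, uy = rotate((cx, cy), -theta)                        # undo rotation
    return (ux - dx, uy - dy)                                # undo translation


POINTS = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
print(centroid(POINTS))
print(run_copy(POINTS, dx=10.0, dy=-3.0, theta=0.7, order=[2, 0, 3, 1]))
```

Results from copies like this agree only approximately, so a voter over them would need a tolerance rather than exact equality.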
Data Re-expression Space



These examples are ad hoc
Proposals in literature are ad hoc
So:
Use data re-expression space categorization
to drive exploration of diversity techniques
(instead of point solutions)
Evaluation

Theoretical:
  Modeling of effects of diversity on network vulnerabilities
    E.g., worm propagation
  Understand limits of diversity
  Categorization of “diversity space”
  Identify unnecessary homogeneity in software
    Not just code but also environment, configuration, etc…
Experimental:
  Directed fault seeding:
    Apply known exploits to target system
    Apply all Genesis techniques
    Evaluate variants’ resistance to attack
  Automated fault seeding
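To give a feel for what modeling the effect of diversity on worm propagation might look like, here is a deliberately simple deterministic sketch. It is not the project's model: the random-scanning growth rule, the `contact_rate` value, and the assumption that the worm exploits exactly one of `variants` equally common variants are all illustrative choices.

```python
def simulate_worm(total_hosts, variants, contact_rate=0.8, steps=50):
    """Return infected-host counts per step for a worm that exploits one variant.

    Assumes hosts are split evenly across `variants` and that each infected
    host probes `contact_rate` randomly chosen hosts per step.
    """
    vulnerable = total_hosts / variants
    infected = 1.0
    history = [infected]
    for _ in range(steps):
        # Fraction of random probes that land on a still-susceptible host.
        susceptible_fraction = (vulnerable - infected) / total_hosts
        infected = min(vulnerable,
                       infected + contact_rate * infected * susceptible_fraction)
        history.append(infected)
    return history


monoculture = simulate_worm(total_hosts=10_000, variants=1)
diverse = simulate_worm(total_hosts=10_000, variants=10)
print(round(monoculture[-1]), round(diverse[-1]))
```

In this toy model diversity both caps the number of reachable hosts and slows the spread, because most probes land on hosts running a different variant.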
Automatic Fault Seeding




Need test cases
Need typical vulnerabilities, i.e., bugs
Can typical bugs be synthesized?
Prior work on syntactic transformations:




Simple mutations
Wide variety of resilience
Defects created with excellent statistical
properties
Plan to try this route
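A simple syntactic mutation of the kind mentioned above can be sketched with Python's standard `ast` module (Python 3.9 or later for `ast.unparse`). This illustrates the general idea only and is not the prior work's tooling: the `BoundaryMutator` class, the `seed_fault` helper, and the choice of turning `<` into `<=` are assumptions for the example.

```python
import ast


class BoundaryMutator(ast.NodeTransformer):
    """Replace the target_index-th `<` comparison with `<=` (an off-by-one fault)."""

    def __init__(self, target_index):
        self.target_index = target_index
        self.seen = 0

    def visit_Compare(self, node):
        self.generic_visit(node)
        new_ops = []
        for op in node.ops:
            if isinstance(op, ast.Lt):
                if self.seen == self.target_index:
                    op = ast.LtE()           # seed the fault here
                self.seen += 1
            new_ops.append(op)
        node.ops = new_ops
        return node


def seed_fault(source, target_index=0):
    """Return source code with one boundary check weakened."""
    tree = BoundaryMutator(target_index).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


ORIGINAL = '''
def in_bounds(index, size):
    return 0 <= index and index < size
'''

print(seed_fault(ORIGINAL))
```

The mutant accepts `index == size`, which resembles the boundary errors behind many buffer overruns, so it gives the evaluation a known fault to aim at.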
Automated Fault Seeding
[Figure: automated fault seeding. A target software system goes through error seeding to produce many seeded copies of the target software system; the labeled stages are error seeding, acceptance tests, Genesis transformations, and vulnerability assessment, with multiple target software systems shown between stages.]
State Of The Implementation

Exists, ready to use:



CSDL
Calling convention spec
STRATA
Specific Questions Posed

What are you trying to do (the problem you are addressing)?

How will you show that you were successful?

What are the implications of successful results (or less than successful
results)?

What is your technical approach?

What is new, or hasn’t been attempted?

What significant problems do you anticipate, what makes your project
difficult and how do you plan to approach the difficulties?

If successful, what have you thought about regarding transitioning the
technology?

If successful, what would be next?
Practical Problem

If this works:





Building a system will require lots of computer
time
Lots of systems will require LOTS of computer
time
But it is just computer time
Will not be able to just press CDs
Will require a substantial engineering
investment
Summary

Automatic application of design diversity:
  Macro, midi, micro
Systematic application of data diversity:
  Internal, external, all dimensions
Seamless integration of the two
Evaluation and assessment:
  Directed fault seeding
  Automated fault seeding
Questions?