Enhancing Availability and Security Through Failure-Oblivious Computing
Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel Roy, and William Beebee, Jr.
Introduction

Memory errors are a common source of program failures
ML and Java use dynamic checks to eliminate such errors
Assumption:
  An invalid memory access makes it unsafe to continue execution
Failure-Oblivious Computing

Instead of throwing an exception or terminating, the program ignores any memory access error and continues
Read (of an out-of-bounds array element)
  Just read a manufactured value
Write (to an out-of-bounds array element)
  Discard the value
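The read and write rules above can be modelled in a few lines. This is a minimal Python sketch of the semantics, not the output of the failure-oblivious C compiler; the class name `FailureObliviousBuffer` and the choice of 0 as the manufactured value are assumptions made for illustration.

```python
class FailureObliviousBuffer:
    """Fixed-size buffer whose out-of-bounds accesses never raise (illustrative model)."""

    def __init__(self, size, manufactured=0):
        self._cells = [0] * size
        self._manufactured = manufactured  # assumed value returned for invalid reads

    def _in_bounds(self, index):
        # Explicit check: Python's negative indexing must not count as valid here.
        return 0 <= index < len(self._cells)

    def read(self, index):
        if self._in_bounds(index):
            return self._cells[index]
        return self._manufactured  # invalid read: hand back a manufactured value

    def write(self, index, value):
        if self._in_bounds(index):
            self._cells[index] = value
        # invalid write: the value is silently discarded


buf = FailureObliviousBuffer(4)
buf.write(2, 7)    # in bounds: stored
buf.write(10, 99)  # out of bounds: discarded
print(buf.read(2), buf.read(10))  # prints "7 0"
```

Neither invalid access changes any cell of the buffer, which is the property the following slides rely on.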
Wrong Results?

Many programs can continue to run
  As long as errors do not corrupt the program’s address space or data structures
Failure-oblivious computing can improve the availability, robustness, and security of such programs
Shouldn’t We Stop at the First Error?

Debugging may not be an option
  No source code
  Not enough time
Failure-oblivious computing can still provide acceptable service
  Better than no service
Servers and Buffer-overrun Attacks

A program allocates a fixed-size buffer
  Then fails to check whether the input string fits in the buffer
A long input string containing executable code can overwrite the stack contents
  Can coerce the server into running arbitrary code
Servers and Buffer-overrun Attacks

Failure-oblivious computing discards the excess characters, preserving the integrity of the stack
  Server detects invalid request and returns an error
  Converts a dangerous attack into an invalid input
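A small model of this conversion, assuming a hypothetical server that copies each request into an 8-byte buffer and then checks it against a list of known commands. The names `BUFFER_SIZE`, `KNOWN_COMMANDS`, `copy_into_buffer`, and `handle_request` are illustrative and do not come from any of the servers discussed.

```python
BUFFER_SIZE = 8                   # hypothetical fixed buffer size
KNOWN_COMMANDS = {b"GET", b"PUT"}  # hypothetical set of valid requests


def copy_into_buffer(data: bytes) -> bytes:
    """Unchecked copy loop, with out-of-bounds writes discarded."""
    buffer = bytearray(BUFFER_SIZE)
    for i, byte in enumerate(data):
        if i < BUFFER_SIZE:
            buffer[i] = byte
        # else: write past the end of the buffer, discarded
    return bytes(buffer[:min(len(data), BUFFER_SIZE)])


def handle_request(data: bytes) -> str:
    command = copy_into_buffer(data)
    if command not in KNOWN_COMMANDS:
        return "ERROR: invalid request"  # the server's ordinary error path
    return "OK: " + command.decode("ascii")


print(handle_request(b"GET"))       # normal request
print(handle_request(b"A" * 64))    # oversized request, truncated then rejected
```

The oversized request never reaches memory outside the buffer; what is left is a truncated string that the server's existing validation rejects.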
Multiple Items or Outputs

Many programs (e.g. mail readers) process multiple items
Some applications generate multiple outputs
  Some outputs are more important than others
Without failure-oblivious computing
  Failure to process one item can prevent the program from processing the rest
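The effect on a multi-item workload can be sketched as follows. This is an illustrative model: the inbox, the `subject_initial` helper, and the use of `"\0"` as the manufactured value are all assumptions, and catching `IndexError` stands in for the program terminating at the first error.

```python
def subject_initial(subject, oblivious):
    """Return the first character of a subject line.

    An empty subject makes index 0 an out-of-bounds read.
    """
    if oblivious and not subject:
        return "\0"    # manufactured value for the invalid read (assumed)
    return subject[0]  # raises IndexError on an empty subject


def process_all(subjects, oblivious, done):
    """Process every item in order, recording results in `done`."""
    for subject in subjects:
        done.append(subject_initial(subject, oblivious))


inbox = ["hello", "", "world"]  # the second item triggers the error

checked = []
try:
    process_all(inbox, oblivious=False, done=checked)
except IndexError:
    pass  # stands in for termination at the first memory error

survived = []
process_all(inbox, oblivious=True, done=survived)

print(checked)   # only the item before the bad one was processed
print(survived)  # every item was processed; the bad one got a manufactured value
```

In the checked run the third item is never reached, even though it is perfectly well formed.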
Benefits and Drawbacks
+ Increased resilience
  Programs degrade gracefully and continue to operate successfully on most of their inputs
+ Increased security
  Can survive stack overruns
+ Reduced development costs
  Reduces the pressure to find and eliminate all disruptive bugs
+ Reduced administration overhead
  Reduces the success rate of attacks
Benefits and Drawbacks
+ Safer integration
  Lowers the risk of using foreign components
- May generate unacceptable results
  Inevitable consequence of better resilience
  Need to convert unanticipated states into anticipated error states
Scope

Interactive computing environments
  Mailers
  Servers
  System administration tools
  Operating systems
  Document processing systems
Mission-critical applications
  Halting is not an option
Scope

Less appropriate for
  Programs where it is hard to determine whether the output is correct
  Safety-critical applications
    Safer to terminate the computation
Example

A Mutt procedure
  Takes an input string
  Returns an encoded output string
  Fails to allocate sufficient space
With standard compilers
  Writes succeed, corrupt the address space, and the program segfaults
With safe-C compilers
  Mutt exits before presenting the GUI
Example

With the failure-oblivious compiler
  The returned string is incorrect
  Server responds with an error
Failure-oblivious approach works for
  Mostly correct programs with subtle errors
Implementation

Failure-oblivious compiler
  Generates two kinds of additional code
Checking code
  Detects attempts to perform illegal accesses
Continuation code
  Executes when the checking code detects an attempt to perform an illegal access
  Discards erroneous writes
  Manufactures values for erroneous reads
Checking Code

Jones and Kelly’s scheme
  Tracks the locations of structs, arrays, and variables
  Each data item is padded with an extra byte
    Initialized to ILLEGAL
  Checks the status of each pointer before dereferencing it
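The idea of tracking objects and checking pointers before use can be sketched with a small object table. This is a simplified Python model under stated assumptions, not Jones and Kelly's implementation: addresses are plain integers, the padding byte is omitted, `ILLEGAL` is modelled as a sentinel pointer value (the value -2 is an assumption), and the class and method names are hypothetical.

```python
import bisect

ILLEGAL = -2  # assumed sentinel for a pointer that has left its object


class ObjectTable:
    """Tracks the base address and size of every known data item."""

    def __init__(self):
        self._bases = []  # sorted base addresses
        self._sizes = {}  # base address -> size in bytes

    def register(self, base, size):
        bisect.insort(self._bases, base)
        self._sizes[base] = size

    def find(self, address):
        """Return (base, size) of the object containing address, or None."""
        i = bisect.bisect_right(self._bases, address) - 1
        if i < 0:
            return None
        base = self._bases[i]
        size = self._sizes[base]
        return (base, size) if address < base + size else None

    def add(self, pointer, offset):
        """Pointer arithmetic: the result must stay inside the source object."""
        if pointer == ILLEGAL:
            return ILLEGAL
        obj = self.find(pointer)
        if obj is None:
            return ILLEGAL
        base, size = obj
        result = pointer + offset
        return result if base <= result < base + size else ILLEGAL

    def can_dereference(self, pointer):
        """The check performed before every dereference."""
        return pointer != ILLEGAL and self.find(pointer) is not None


table = ObjectTable()
table.register(1000, 16)  # a 16-byte array
table.register(1016, 8)   # an adjacent 8-byte struct

inside = table.add(1000, 15)   # last byte of the array
outside = table.add(1000, 16)  # steps into the neighbouring struct
print(table.can_dereference(inside), table.can_dereference(outside))  # prints "True False"
```

The second pointer lands on a valid address (the start of the neighbouring struct) but is still rejected, because it was derived from the array and left its bounds.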
Continuation Code

Write continuation code
  Discards the value
Read continuation code
  Redirects the read to a preallocated buffer of values
    Mostly 0s and 1s
    Iterates through all small integers
      Increases the chance of exiting loops, to avoid nontermination
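One sequence with the shape described above is 0, 1, 2, 0, 1, 3, 0, 1, 4, and so on. The sketch below generates it; the exact ordering and the `limit` parameter are assumptions chosen to match "mostly 0s and 1s" and "iterates through all small integers", not a statement of what the compiler emits.

```python
from itertools import islice


def manufactured_values(limit=256):
    """Yield 0, 1, n for n = 2 .. limit - 1, then start over (assumed sequence)."""
    while True:
        for n in range(2, limit):
            yield 0
            yield 1
            yield n


print(list(islice(manufactured_values(), 9)))  # [0, 1, 2, 0, 1, 3, 0, 1, 4]

# A loop that scans past the end of a buffer looking for a '/' terminator
# still exits, because the sequence eventually produces every small integer.
values = manufactured_values()
reads = 0
while next(values) != ord("/"):
    reads += 1
print(reads)  # number of manufactured reads consumed before '/' appeared
```

A constant manufactured value such as 0 would leave that loop spinning forever; cycling through small integers trades a few extra iterations for termination.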
Continuation Code

Optional logging
  Can be used to track down errors
Failure-oblivious computing
  Can also reduce the incentive to eliminate errors
Case Studies

Recompiled widely used open-source programs with known memory errors
  Pine (mail user agent)
  Midnight Commander (file manager)
  Sendmail (mail transfer agent)
  Mutt (mail user agent)
  Samba (file server)
  WsMp3 (mp3 server)
  Apache (http server)
Methodology

Compare three versions of each program
  Compiled by a standard C compiler
  Compiled by the CRED safe-C compiler
  Compiled by the failure-oblivious compiler
Workloads
  Contain inputs that exploit known security vulnerabilities
Pine 4.44

Fails to parse certain legal From fields
  Possible to execute arbitrary code
Standard version: crashed
Safe version: terminated with an error
Failure-oblivious version: continued to run
  Was able to read and forward the message with the problematic From field
Midnight Commander

Memory error triggered by symbolic links in tgz files
Standard version: segfaulted
Safe version: terminated with an error message
Failure-oblivious version: continued to run
Sendmail 8.11.6

Memory error allows an attacker to execute arbitrary code with root privileges on the machine running the Sendmail server
Standard version: vulnerable to an attack that gains a root shell
Safe version: exited with an error message
Failure-oblivious version: not vulnerable to the attack
Mutt 1.4

Memory error in the conversion from UTF-8 to UTF-7 string formats
Standard version: crashed
Safe version: exited with an error message
Failure-oblivious version: continued to run
  6x slowdown
  Took about 1 second to load 3,000 messages
Samba 2.2.5

Memory corruption error
  A remote user can obtain the root shell
Standard version: vulnerable to an attack that gains a root shell
Safe version: functional until the attack
  The child process exited
Failure-oblivious version: continued to run
  Similar performance compared to the safe version
WsMp3 0.0.5

Memory-error vulnerability
Standard version: segfaulted
Safe version: crashed the entire server
  Single threaded
Failure-oblivious version: survived the attack
Apache 2.0.47

mod_alias contains a memory-error vulnerability
Standard version: child process segfaulted
Safe version: child process exited properly
Failure-oblivious version: child process redirected the attacking request to a nonexistent URL
  The child process stayed alive and processed subsequent requests correctly
Gzip 1.2.4a

Memory error in its file name processing code
  An attacker can run arbitrary code
Standard version: segfaulted
  Remaining files were not processed
Safe version: exited at the problematic file
Failure-oblivious version: printed an error message for the problematic files
  Proceeded to process all remaining files
  10x slowdown (1.2 MB/sec)
Discussion

Failure-oblivious versions survived all memory-corruption attempts
Work well for this class of applications
  One input has a minimal effect on the next input
    Unless it corrupts the data structures or address space
Little performance degradation for interactive programs
Safe versions are prone to DoS attacks
  Tend to terminate prematurely
Related Work

Any safe-C compiler can be modified to implement a failure-oblivious compiler
  Discard unsafe writes
  Manufacture values for unsafe reads
Typically < 2x slowdown
  Occasionally 8x slowdown
  Does not perceptibly degrade the response times of interactive programs
    Also I/O-bound programs
Safe Languages

Java and ML
  Modify the exception handling code
    Discard illegal writes
    Return manufactured values for illegal reads
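The same change can be illustrated in any language whose bounds checks raise exceptions. The sketch below uses Python as a stand-in for Java or ML: the handler for the out-of-bounds exception is replaced so that illegal writes are dropped and illegal reads return a manufactured value. The class name `ObliviousList` and the value 0 are assumptions made for illustration.

```python
class ObliviousList(list):
    """List whose out-of-bounds exception paths are made failure-oblivious."""

    def __getitem__(self, index):
        try:
            return super().__getitem__(index)
        except IndexError:
            return 0  # manufactured value for the illegal read (assumed)

    def __setitem__(self, index, value):
        try:
            super().__setitem__(index, value)
        except IndexError:
            pass  # illegal write: discard the value


xs = ObliviousList([1, 2, 3])
xs[10] = 5             # discarded instead of raising IndexError
print(xs[1], xs[10])   # prints "2 0"
print(list(xs))        # contents unchanged: [1, 2, 3]
```

Only the exception path changes; accesses that the language already considers legal behave exactly as before.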
Traditional Error Recovery

Traditional approaches
  Reboot
  Checkpointing
  Partial system restarts
  Hardware redundancy
Failure-oblivious computing reduces downtime and vulnerability to persistent errors
  Restarting Pine will not solve the problem
Other Approaches

Data structure repair
  Failure-oblivious approach is preventive
Statically detect all buffer-overrun errors
  May conservatively reject almost working code
Buffer-overrun detection tools
  Detect overwriting the return address
  Detect overwriting function pointers
  Failure-oblivious approach prevents the attack from corrupting the address space
Conclusion

Failure-oblivious computation enhances availability, resilience, and security
  Converts dangerous unknown system states to known error cases