Ben Livshits
Based in part on Stanford class slides from
http://www.stanford.edu/class/cs295/
and on slides from Ben Zorn’s talk
• Lecture 1:
Introduction to static analysis
• Lecture 2:
Introduction to runtime analysis
• Lecture 3:
Applications of static and runtime analysis
• reliability & bug finding
• performance
• security
• Purify for finding memory errors [1992]
• Detecting memory leaks with Purify and GC
[1992]
• Detecting dangling pointers & buffer overruns
with DieHard [2006]
• Detecting buffer overruns with StackGuard
[1997]
• Dangling pointers: If the program mistakenly frees a live object, the allocator
may overwrite its contents with a new object or heap metadata.
• Buffer overflows: Out-of-bound writes can corrupt the contents of live objects
on the heap.
• Heap metadata overwrites: If heap metadata is stored near heap objects, an
out-of-bound write can corrupt it.
• Uninitialized reads: Reading values from newly-allocated or unallocated
memory leads to undefined behavior.
• Invalid frees: Passing illegal addresses to free can corrupt the heap or lead
to undefined behavior.
• Double frees: Repeated calls to free on objects that have already been freed
cause freelist-based allocators to fail.
FINDING MEMORY ERRORS IN C/C++ PROGRAMS
• C/C++ are not memory-safe
• Neither the compiler nor the runtime system
enforces type abstractions
• What is memory-safe vs. type-safe?
• Possible to read or write outside of your
intended data structure
• Among other bad behaviors
• What else is possible that you can’t do in
Java or Scheme or ML or F#?
• Each byte of memory is in one of 3
states:
• Unallocated: cannot be read or written
• Allocated but uninitialized: cannot be read
• Allocated and initialized: anything goes
• Check the state of each byte on each access
• Binary instrumentation
• Add code before each load and store
• Represent states as giant array
• 2 bits per byte of memory
• What is the memory overhead?
• 25%!!
• Catches byte-level errors
• Won’t catch bit-level errors
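The three byte states and the 2-bits-per-byte encoding above can be sketched as a small shadow-memory table. This is only an illustration, not Purify's actual implementation: the numeric state encoding, the toy region size `HEAP_BYTES`, and all function names are assumptions. Two state bits for every 8 data bits is where the 25% memory overhead comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-byte states from the slides; the numeric encoding is an assumption. */
enum { UNALLOCATED = 0, ALLOC_UNINIT = 1, ALLOC_INIT = 2 };

#define HEAP_BYTES 1024u                /* size of the toy tracked region */
static uint8_t shadow[HEAP_BYTES / 4];  /* 2 bits per byte: 1/4 of the region */

unsigned get_state(size_t addr) {
    assert(addr < HEAP_BYTES);
    return (shadow[addr / 4] >> ((addr % 4) * 2)) & 3u;
}

void set_state(size_t addr, unsigned st) {
    assert(addr < HEAP_BYTES && st <= 3u);
    unsigned shift = (unsigned)(addr % 4) * 2;
    shadow[addr / 4] =
        (uint8_t)((shadow[addr / 4] & ~(3u << shift)) | (st << shift));
}

/* Mark a range, as hooks on malloc and free would. */
void mark_range(size_t addr, size_t len, unsigned st) {
    for (size_t i = 0; i < len; i++) set_state(addr + i, st);
}

/* Check inserted before a load: 1 if legal, 0 if it would be reported. */
int check_load(size_t addr) {
    return get_state(addr) == ALLOC_INIT;
}

/* Check inserted before a store: legal on any allocated byte,
   and the byte counts as initialized afterwards. */
int check_store(size_t addr) {
    if (get_state(addr) == UNALLOCATED) return 0;
    set_state(addr, ALLOC_INIT);
    return 1;
}
```

Addresses here are offsets into a toy region; a real tool would apply the same checks to machine addresses through binary instrumentation.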
• We can only detect bad accesses if they
are to unallocated or uninitialized memory
• Try to make all bad accesses be of those two
forms
• We can make this part of our custom memory
allocator
• Red Zones
• Leave buffer space between allocated objects that is never allocated
• Guarantees that walking off the end of an array accesses unallocated memory
• Aging Freed Memory
• When memory is freed, do not reallocate immediately
• Helps catch dangling pointer errors
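Both allocator tricks can be sketched on a toy heap that tracks one state per byte. Everything here is a hypothetical illustration: the red-zone width, the quarantine depth, the exact-fit reuse policy, and the function names are assumptions, not Purify's design.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_BYTES 256
#define RED_ZONE   8              /* assumed red-zone width in bytes */
#define QUARANTINE 2              /* assumed number of frees held back */
#define MAX_REUSE  16
#define NO_BLOCK   ((size_t)-1)

enum { UNALLOCATED = 0, ALLOCATED = 1 };

struct block { size_t addr, len; };

static unsigned char state[HEAP_BYTES];       /* one state per heap byte */
static size_t next_free = RED_ZONE;           /* bump pointer after a leading red zone */
static struct block quarantine[QUARANTINE];   /* FIFO of recently freed blocks */
static size_t q_len;
static struct block reusable[MAX_REUSE];      /* blocks that have aged out */
static size_t r_len;

size_t toy_alloc(size_t len) {
    if (len == 0 || len > HEAP_BYTES) return NO_BLOCK;
    /* Reuse only blocks that left quarantine (exact fit, for brevity). */
    for (size_t i = 0; i < r_len; i++) {
        if (reusable[i].len == len) {
            size_t addr = reusable[i].addr;
            reusable[i] = reusable[--r_len];
            memset(state + addr, ALLOCATED, len);
            return addr;
        }
    }
    if (next_free + len + RED_ZONE > HEAP_BYTES) return NO_BLOCK;
    size_t addr = next_free;
    memset(state + addr, ALLOCATED, len);
    next_free = addr + len + RED_ZONE;   /* the gap is never handed out */
    return addr;
}

void toy_free(size_t addr, size_t len) {
    assert(addr + len <= HEAP_BYTES);
    memset(state + addr, UNALLOCATED, len);   /* later accesses are detectable */
    if (q_len == QUARANTINE) {
        /* The oldest freed block has aged enough to be reused
           (silently dropped if the toy reuse list is full). */
        if (r_len < MAX_REUSE) reusable[r_len++] = quarantine[0];
        memmove(quarantine, quarantine + 1,
                (QUARANTINE - 1) * sizeof quarantine[0]);
        q_len--;
    }
    quarantine[q_len++] = (struct block){ addr, len };
}

/* The per-access check: is this byte currently allocated? */
int access_ok(size_t addr) {
    return addr < HEAP_BYTES && state[addr] == ALLOCATED;
}
```

With these definitions, an access one byte past a block lands in a red zone and fails `access_ok`, and a freed block is not handed out again until `QUARANTINE` later frees have pushed it out of the queue.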
• One of the first commercially successful runtime tools
• Was an independent company that got bought by IBM Rational
• Overhead can vary from 25% to 40x
• This is where buffer overruns come from!
• Why can’t we catch them?
PROGRESS & OPEN PROBLEMS
• Memory leaks are at least as serious as memory
corruption errors
• Also very difficult to find
• Manifest only over hours, days, weeks
• Often persist in production code
• Managed languages such as Java and C# don’t really
help
• We can find many memory leaks using
techniques borrowed from garbage collection
• Any memory with no pointers to it is leaked
• There is no way to free this memory
• Run a garbage collector
• But don’t free any garbage
• Just detect the garbage
• Any inaccessible memory is leaked memory
• Can we do this in C/C++ at all? Sort of…
• It is sometimes hard to tell what is accessible in a C/C++ program
• Cases
• No pointers to a malloc’d block: definitely garbage
• No pointers to the head of a malloc’d block, only into its interior: maybe garbage
• Pointers to the head of a malloc’d block: not garbage by usual definition
• From time to time, run a garbage collector
• Use mark and sweep
• Report areas of memory that are definitely or probably
garbage
• No type safety ==> no memory safety
• Is this as easy as in Java?
• Bookkeeping
• Need to report who malloc’d the blocks originally
• Store this information in the red zone between objects
• Used in Purify, but watch out for memory overhead
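A minimal sketch of the mark phase follows, assuming a conservative scan that treats every word as a potential pointer. The names (`tracked_alloc`, `find_leaks`), the explicit root array, and the side table are assumptions made for illustration. Unlike the scheme on the slide, the allocation site is kept in a side table rather than in the red zone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum verdict { DEFINITELY_LEAKED, POSSIBLY_LEAKED, REACHABLE };

struct block {
    unsigned char *base;   /* address returned by the allocator */
    size_t size;
    const char *site;      /* bookkeeping: who allocated the block */
    enum verdict verdict;
    int scanned;
};

#define MAX_BLOCKS 16
static struct block blocks[MAX_BLOCKS];
static size_t n_blocks;

void *tracked_alloc(size_t size, const char *site) {
    assert(n_blocks < MAX_BLOCKS);
    void *p = calloc(1, size);
    assert(p != NULL);
    blocks[n_blocks++] = (struct block){ p, size, site, DEFINITELY_LEAKED, 0 };
    return p;
}

static void mark_word(uintptr_t word);

/* Conservative scan: every aligned word is a potential pointer. */
static void scan(const unsigned char *mem, size_t len) {
    for (size_t off = 0; off + sizeof(uintptr_t) <= len; off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, mem + off, sizeof word);
        mark_word(word);
    }
}

static void mark_word(uintptr_t word) {
    for (size_t i = 0; i < n_blocks; i++) {
        struct block *b = &blocks[i];
        uintptr_t lo = (uintptr_t)b->base;
        if (word < lo || word >= lo + b->size) continue;
        /* Head pointer: reachable. Interior pointer only: maybe garbage. */
        enum verdict v = (word == lo) ? REACHABLE : POSSIBLY_LEAKED;
        if (v > b->verdict) b->verdict = v;
        if (!b->scanned) { b->scanned = 1; scan(b->base, b->size); }
        return;
    }
}

/* Mark from the roots; nothing is freed, garbage is only classified. */
void find_leaks(const void *roots, size_t roots_len) {
    for (size_t i = 0; i < n_blocks; i++) {
        blocks[i].verdict = DEFINITELY_LEAKED;
        blocks[i].scanned = 0;
    }
    scan(roots, roots_len);
}
```

After `find_leaks`, each entry in `blocks` carries one of the three verdicts from the cases above, and `site` says where a leaked block was allocated. A real tool would scan the stack, registers, and globals as roots instead of a caller-supplied array.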
• A Limitation
• Only finds leaks to unreachable objects
• Doesn’t cover leaks in languages with GC
• Retaining data structures longer than needed
• In practice, also a serious source of leaks, especially in
Java . . .
• Look for objects not accessed for a “long time”
• For each object
• Track it from the moment it is allocated
• Record the time of the last access (read or write)
• Discard information when object is de-allocated
• Periodically
• Scan all objects
• Warn about objects unused for a “long time”
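The staleness heuristic above can be sketched with a small tracking table. The hook names (`on_alloc`, `on_access`, `on_free`), the logical clock, and the fixed table size are assumptions for illustration; a real tool would get these events from instrumentation and would likely sample accesses to keep overhead low.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 32

struct tracked {
    const void *addr;
    unsigned long last_access;   /* logical time of the last read or write */
    int live;
};

static struct tracked table[MAX_OBJECTS];
static unsigned long now;        /* logical clock; wall-clock time would also work */

static struct tracked *lookup(const void *p) {
    for (size_t i = 0; i < MAX_OBJECTS; i++)
        if (table[i].live && table[i].addr == p) return &table[i];
    return NULL;
}

/* Start tracking at allocation time. */
void on_alloc(const void *p) {
    assert(p != NULL);
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!table[i].live) {
            table[i] = (struct tracked){ p, now, 1 };
            return;
        }
    }
    /* Table full: the object goes untracked in this sketch. */
}

/* Record the time of the last access. */
void on_access(const void *p) {
    struct tracked *t = lookup(p);
    if (t != NULL) t->last_access = now;
}

/* Discard the information when the object is de-allocated. */
void on_free(const void *p) {
    struct tracked *t = lookup(p);
    if (t != NULL) t->live = 0;
}

void tick(unsigned long dt) { now += dt; }

/* Periodic scan: count live objects unused for longer than the threshold. */
size_t count_stale(unsigned long threshold) {
    size_t stale = 0;
    for (size_t i = 0; i < MAX_OBJECTS; i++)
        if (table[i].live && now - table[i].last_access > threshold) stale++;
    return stale;
}
```

The threshold that counts as a “long time” is a tuning parameter: too short and the tool warns about idle but needed data, too long and real leaks go unreported.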
TOLERATING MEMORY ERRORS AT RUNTIME
• Buffer overflow
char *c = malloc(100);
c[101] = 'a';   /* out of bounds: valid indices are 0..99 */
• Dangling reference
char *p1 = malloc(100);
char *p2 = p1;
free(p1);
p2[0] = 'x';    /* p2 still points at the freed block */
• Increase robustness of installed code base
• Potentially improve millions of lines of code
• Minimize effort – ideally no source mods, no recompilation
• Reduce requirement to patch
• Patches are expensive (detect, write, deploy)
• Patches may introduce new errors
• Trade resources for robustness
• E.g., more memory implies higher reliability
• Make deployment easy
• Change the allocator DLL, no changes to code needed
• Make existing programs more fault tolerant
• Define semantics of programs with errors
• Programs complete with correct result despite errors
• Emery D. Berger and Benjamin G. Zorn, "DieHard: Probabilistic Memory
Safety for Unsafe Languages", PLDI’06
• DieHard: correct execution in face of errors with high
probability
• Plug-compatible replacement for malloc/free in C lib
• Define “infinite heap semantics”
• Programs execute as if each object allocated with unbounded memory
• All frees ignored
• Approximating infinite heaps: 3 key ideas
1. Overprovisioning
2. Randomization
3. Replication
• Allows analytic/probabilistic reasoning about safety
Expand size requests by a factor of M (e.g., M=2)
Pr(write corrupts) = ½ ?
Randomize object placement
Pr(write corrupts) = ½ !
[Figure: heap layouts of objects 1-5 after expansion and after randomized placement]
Replicate process with different randomization seeds
Broadcast input to all replicas
Voter: compare outputs of replicas, kill when replica disagrees
[Figure: replicas P1, P2, P3 with differently randomized heaps, fed the same input, with outputs sent to the voter]
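The voting step can be sketched as a strict-majority comparison over replica outputs. This is a hypothetical illustration: the function name `vote`, the use of whole strings as outputs, and the strict-majority rule are assumptions, not a description of DieHard's voter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare replica outputs. Returns the index of a replica whose output is
 * shared by a strict majority, or -1 if there is no majority. On success,
 * disagrees[i] is set to 1 for each replica that should be killed.
 */
int vote(const char *const outputs[], size_t n, int disagrees[]) {
    assert(outputs != NULL && disagrees != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t agree = 0;
        for (size_t j = 0; j < n; j++)
            if (strcmp(outputs[i], outputs[j]) == 0) agree++;
        if (2 * agree > n) {
            for (size_t j = 0; j < n; j++)
                disagrees[j] = strcmp(outputs[i], outputs[j]) != 0;
            return (int)i;
        }
    }
    return -1;
}
```

Because each replica uses a different randomization seed, a memory error is unlikely to corrupt all replicas in the same way, so a corrupted replica tends to show up as the odd one out.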
• Allocation
• Segregate objects by size (log2), bitmap allocator
• Within size class, place objects randomly in address space
• De-allocation
• Expansion factor => frees deferred
• Extra checks for illegal free
• Separate metadata from user data
• Fill objects with random values – for detecting uninitialized reads
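A minimal sketch of one size class of such an allocator, under stated assumptions: the constants `OBJ_SIZE`, `SLOTS`, and `M`, the function names, and the use of `rand()` are all illustrative choices and are not taken from the DieHard source. The occupancy map uses one byte per slot for clarity where a real bitmap would use one bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define OBJ_SIZE 16   /* a single size class */
#define SLOTS    64   /* capacity of this class's region */
#define M        2    /* expansion factor: at most SLOTS / M objects live */

static unsigned char heap[SLOTS * OBJ_SIZE];
static unsigned char in_use[SLOTS];   /* metadata kept apart from user data */
static size_t live;

void *dh_alloc(void) {
    if (live >= SLOTS / M) return NULL;      /* keep the region at most 1/M full */
    for (;;) {
        size_t i = (size_t)rand() % SLOTS;   /* random probe; at least half are free */
        if (in_use[i]) continue;
        in_use[i] = 1;
        live++;
        unsigned char *p = heap + i * OBJ_SIZE;
        /* Random fill, so uninitialized reads differ from run to run. */
        for (size_t k = 0; k < OBJ_SIZE; k++) p[k] = (unsigned char)rand();
        return p;
    }
}

/* Returns 1 if the object was freed, 0 if the call was ignored as illegal. */
int dh_free(void *ptr) {
    uintptr_t a = (uintptr_t)ptr, base = (uintptr_t)heap;
    if (a < base || a >= base + sizeof heap) return 0;   /* not from this heap */
    size_t off = (size_t)(a - base);
    if (off % OBJ_SIZE != 0) return 0;                   /* interior pointer */
    size_t i = off / OBJ_SIZE;
    if (!in_use[i]) return 0;                            /* double free */
    assert(live > 0);
    in_use[i] = 0;
    live--;
    return 1;
}
```

Since metadata lives outside the heap region and every free is validated against it, invalid and double frees become no-ops instead of corrupting allocator state.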
[Figure: Runtime on Windows. Normalized runtime of malloc vs. DieHard on cfrac, espresso, lindsay, p2c, roboop, and their geometric mean]
• Synthetic:
• Tolerates high rate of synthetically injected errors in
SPEC programs
• SPEC benchmarks:
• Detected two previously unreported benign bugs
(197.parser and espresso)
• Avoiding real errors:
• Successfully hides buffer overflow error in Squid web
cache server (v 2.3s5)
• Avoids dangling pointer error in Mozilla
• DoS in glibc & Windows
AVOIDING SECURITY EXPLOITS
• “Smashing the Stack for Fun and Profit”
• Aleph One (AKA Elias Levy), Phrack 49, August 1996
• It is a cook book for how to create exploits for “stack smashing” attacks
• Prior to this paper, buffer overflow attacks were known, but not widely exploited
• “Validate all input parameters” is a security principle going back to the 1960s
• After this paper, attacks became rampant
• Stack smashing vulns are massively common, easy to discover, and easy to exploit
• Buffer overflow:
• Program accepts string input, placing it in a buffer
• Program fails to correctly check the length of the input
• Attacker gets to overwrite adjacent state, corrupting it
• Stack Smash:
• Special case of a buffer overflow that corrupts the activation record
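The missing length check described above is small. The following sketch shows one way to write it; the helper name `copy_checked` and the choice to reject oversized input instead of truncating it are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy input into buf only if it fits, including the terminating NUL.
 * Returns 0 on success and -1 if the input was rejected.
 */
int copy_checked(char *buf, size_t buf_size, const char *input) {
    assert(buf != NULL && input != NULL);
    size_t len = strlen(input);
    if (len >= buf_size) return -1;   /* would overrun: reject the input */
    memcpy(buf, input, len + 1);      /* len + 1 also copies the NUL */
    return 0;
}
```

An unchecked `strcpy(buf, input)` in the same position would keep writing past `buf` into whatever lies next to it, which on the stack includes the activation record.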
• Return address
• Overflow changes it to point somewhere else
• “Shell Code”
• Point to exploit code that was encoded as CPU instructions in the attacker’s string
• That code does exec(“/bin/sh”), hence “shell code”
• Why are we so vulnerable to something so trivial?
• Because C chose to represent strings as null terminated instead of (base, bound) tuples
• Because strings grow up and stacks grow down
• Because we use Von Neumann architectures that store code and data in the same memory
• But these things are hard to change … mostly
• Try to move away from Von Neumann architecture by making key regions of memory be non-executable
• Problem: x86 memory architecture does not distinguish between “readable” and “executable” per page
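To make the (base, bound) alternative to null-terminated strings concrete, here is a hypothetical sketch. The struct layout, the extra `len` field, and the name `bs_append` are assumptions for illustration. Because the bound travels with the pointer, every write can be checked against it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A string carried as (base, bound) plus its current length. */
struct bounded_str {
    char *base;     /* start of the storage */
    size_t bound;   /* capacity of the storage in bytes */
    size_t len;     /* bytes in use; always <= bound */
};

/* Append n bytes; returns 0 on success, -1 if they would not fit. */
int bs_append(struct bounded_str *s, const char *src, size_t n) {
    assert(s != NULL && s->len <= s->bound);
    if (n > s->bound - s->len) return -1;   /* bound check on every write */
    memcpy(s->base + s->len, src, n);
    s->len += n;
    return 0;
}
```

A null-terminated `char *` gives the callee no such bound to check, which is why functions like `strcpy` cannot detect an overrun themselves.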
• “Solar Designer” introduces the Linux non-executable
stack patch
• Fun with x86 segmentation registers maps the stack differently
from the heap and static data
• Results in a non-executable stack
• Effective against naïve Stack Smash attacks
• Bypassable:
• Inject your shell code into the heap (still executable)
• Point return address at your shell code in the heap
• Compile in integrity checks for activation records
• Insert a “canary word” (after the Welsh miner’s canary)
• If the canary word is damaged, then your stack is corrupted
• Instead of jumping to attacker code, abort the program
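The canary check can be illustrated with a simulated activation record. This sketch does not touch the real stack: `struct frame`, its field order, and the helper names are assumptions, and the constant canary is the simplest possible choice. A compiler-based scheme would emit the equivalent of `frame_enter` and `frame_check` in each function's entry and exit code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated activation record: an overrun of buf reaches the canary
   before it reaches the return address. */
struct frame {
    char buf[16];
    uintptr_t canary;
    uintptr_t return_address;
};

static const uintptr_t CANARY = 0xDEADBEEF;   /* fixed value, for illustration only */

/* What the function entry code would do. */
void frame_enter(struct frame *f, uintptr_t ret) {
    memset(f->buf, 0, sizeof f->buf);
    f->canary = CANARY;
    f->return_address = ret;
}

/* Stand-in for an unchecked copy into buf: writes len bytes from the
   start of the frame, possibly running over the fields after buf. */
void unchecked_copy(struct frame *f, const void *src, size_t len) {
    assert(len <= sizeof *f);
    memcpy((unsigned char *)f, src, len);
}

/* What the function exit code would do: 1 means safe to return,
   0 means the program should abort. */
int frame_check(const struct frame *f) {
    return f->canary == CANARY;
}
```

A write that stays inside `buf` leaves the canary alone, while one that reaches `return_address` has to cross the canary first, so the exit check catches it before the corrupted address is used.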
• Written in a few days by one intern
• A patch to GCC of fewer than 100 lines of code
• Helped a lot that the GCC function preamble and function postamble
code generator routines were nicely isolated
• First canary was hardcoded 0xDEADBEEF
• Easily spoofable, but worked for proof of concept
• The random canary:
• Pull a random integer from the OS /dev/random at process startup time
• Simple in concept, but in practice it is very painful to make reading from /dev/random work while still inside crt0.o
• Made it work, but motivated us to seek something simpler
• “Terminator” canary:
• CR, LF, 00, -1: the symbols that terminate various string library functions
• Rationale: will cause all the standard string mashers to terminate while trying to write the canary, so an attacker cannot spoof the canary and successfully write beyond it
• Still vulnerable to attacks against poorly used memcpy() code, but such buffer overflows were thought to be rare
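The terminator-canary rationale can be checked on a simulated frame laid out as a byte array. The layout sizes, the byte order of the canary, and the helper names are assumptions for illustration; `copy_like_strcpy` stands in for any string routine that stops at NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF = 8, CAN = 4, RET = 4, FRAME = BUF + CAN + RET };

/* The terminator bytes named above; this particular order is an assumption. */
static const unsigned char TERMINATOR[CAN] = { 0x00, 0x0D, 0x0A, 0xFF };

/* Frame layout: BUF bytes of buffer, then the canary, then the return address. */
void frame_init(unsigned char frame[FRAME]) {
    memset(frame, 0, FRAME);
    memcpy(frame + BUF, TERMINATOR, CAN);
    memset(frame + BUF + CAN, 0x11, RET);   /* stand-in return address */
}

/* Behaves like strcpy: copies up to and including the first NUL in src.
   Returns the number of bytes written. */
size_t copy_like_strcpy(unsigned char *dst, const unsigned char *src) {
    assert(dst != NULL && src != NULL);
    size_t i = 0;
    do {
        dst[i] = src[i];
    } while (src[i++] != 0);
    return i;
}

int canary_intact(const unsigned char frame[FRAME]) {
    return memcmp(frame + BUF, TERMINATOR, CAN) == 0;
}
```

A payload that embeds the canary bytes in order to preserve them contains a NUL, so the copy stops there and never reaches the return address. A payload without the embedded canary overwrites it with other bytes and is detected.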
• 1999, “Emsi” creates the frame pointer attack
• Frame pointer stored below the canary, so it is corruptible
• Change FP to point to a fake activation record constructed on the heap
• Function return code will believe FP, interpret the fake activation record, and jump to shell code
• Bypasses both Terminator and Random Canaries
• XOR Random Canary
• XOR the correct return address with the random canary
• Integrity check must match both the random number, and the correct return address
• Focus on malware detection and prevention
• Much harder to “fix” than stack-based buffer overruns
• Externally exploitable bugs
• Nozzle
• Runtime detector for heap spraying attacks
• False positive rates: 10^-6
• Overhead: 5-10%
• Zozzle
• Static/statistical detector
• False positive rates: 10^-6
• Overhead: very small