Monoculture and Diversity
Nora Sovarel and Joel Winstead
21 September 2004

What is monoculture?
“the cultivation or growth of a single crop
or organism especially on agricultural or
forest land”
Merriam-Webster Online
Monoculture in Biology
The Irish Potato Famine, 1845-1850
• About half of Ireland’s population depended
on the potato crop
• The fungus Phytophthora infestans
appeared in Ireland in 1845
– Every potato farm in Ireland was vulnerable
• Consequences for Ireland:
– 1 million people died
– 1-2 million emigrated
What about computing?
Most statistics agree that Microsoft has at least
90% of the OS market.
For example, thecounter.com:
– Win XP 56%
– Win 98 20%
– Win 2000 15%
– Win NT 1%
– Win 95 + Win 3x less than 1%
http://www.thecounter.com/stats/2004/August/os.php
Monocultures in Computing
• Operating Systems – 90% Microsoft
• Browsers – IE, Opera, Netscape
• Web Servers – Apache, IIS
• Routers – 85% Cisco
• Processors – x86, Sparc
Why are we in this situation?
• Users – single interface
• System Administrators – uniform software configurations
• Software Companies
– Lower distribution and maintenance costs
– Compatibility and file formats
What are the consequences?
• Same vulnerabilities for everyone
• One worm/virus for majority of systems
• Virus writers also like economy of scale:
– “write once, exploit everywhere”
What can we do?
• opposite of monoculture
• diversity
• more than one
Diversity as a defense
If we’re not all running exactly the same
code:
– A single attack cannot compromise everybody
• epidemic attacks cease to scale
– An attacker won’t know what specific attack to
use against a particular target
• targeted attacks become more expensive
How many?
Are 10 variants of each piece of software
and hardware enough?
• normal operations disrupted with only a
small fraction of computers attacked
– Witty worm
• applications show same vulnerabilities
across OS's
We need more...
• We need every system to look different to
the attacker
• We need all systems to look exactly the
same to the users and administrators
• We need to be able to deploy and patch
systems quickly and economically
Can we have the benefits without
the disadvantages?
• Same user interface
• Different vulnerabilities
• Can the right kind of diversity be
generated automatically, without side effects?
Roadmap
• Threat Model
• Classes of attacks
• Diversity defences:
– Address space randomization
– Pointer randomization
– Instruction set randomization
– Keyed hash functions
• Effectiveness of these defences
Threat Model
• Threat: automated, destructive worms
– Require quick, automated, remote infection
– “Write-once, exploit everywhere”
– Assume attacker knows code, but not key material
• We are not:
– defending against local attackers
– defending against expensive brute-force attacks
– defending against targeted attacks
• Goal: make cost of automated infection high
• Crashing program is better than spreading worm
Classes of Attacks
• Code injection attacks
• Existing code attacks
• Algorithmic complexity attacks
Code Injection Attacks
• Stack Smashing Attack
• SQL Code Injection
• Perl Code Injection
• Double Pointer Attacks
Stack Smashing
Stack layout: return addr | argc | argv | a | b | c | return addr | buf[ ]

int main(int argc, char *argv[]) {
    ...
    foo(a, b, c);
    ...
    if (everything_is_kosher) {
        exec("/bin/sh");
    }
}

void foo(int a, int b, int c) {
    char buf[100];
    ...
    gets(buf);   /* no bounds check: input longer than buf overruns the stack */
    ...
}
Stack Smashing
Stack layout with attacker input: return addr | argc | argv | a | b | c | return addr | buf[ ] (holds malicious payload)
(Code as on the previous slide: gets(buf) in foo reads the attacker's input.)
Stack Smashing
Stack layout: return addr | argc | argv | a | b | c | return addr (overwritten) | buf[ ] (malicious code)
• Payload overwrites return address
• New address can point to injected
code or existing code
• The payload can also overwrite local
variables
• Pointers to code can also occur in
other places
– virtual functions, callbacks
• Runtime type information on the
heap can also be overwritten
See “Smashing the Stack for Fun and Profit” in
Phrack #49 for more
Existing Code Attacks
• Format String Attack
• Data Modification Attack
• Integer Overflow
• return-to-libc attacks
Why do these attacks work?
• The way code, stack, and data are laid out
in memory is fairly predictable
Why do these attacks work?
• The way code, stack, and data are laid out in memory is
fairly predictable:
Shared Libraries
Stack
Heap
Code
Defence Through Diversity
• Solution: randomise layout of address space:
[Figure: two address spaces side by side, each containing Shared Libraries, Stack, Heap, and Code]
What does this buy us?
• This can be done at link time
– low overhead
• Attacker must know or guess what address to
jump to
• The starting addresses of code, stack, heap, and
library segments add some entropy
– On a 32-bit system, about 16 bits for each segment
– Is this enough?
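A minimal sketch of what "about 16 bits for each segment" could look like in code. The page size, the names randomized_base and candidate_bases, and the idea of passing the random bits in as a parameter are all assumptions made for illustration; they are not taken from any particular linker or loader.

```c
#include <stdint.h>

/* Hypothetical layout parameters, chosen only for illustration. */
#define PAGE_SHIFT   12u   /* assume 4 KiB pages */
#define ENTROPY_BITS 16u   /* "about 16 bits for each segment" */

/* Shift a segment's default base by a random, page-aligned offset.
 * `random_bits` stands in for whatever randomness the linker or
 * loader draws; only the low ENTROPY_BITS bits are used. */
uint32_t randomized_base(uint32_t default_base, uint32_t random_bits)
{
    uint32_t slot = random_bits & ((1u << ENTROPY_BITS) - 1u);
    return default_base + (slot << PAGE_SHIFT);
}

/* Number of distinct bases an attacker has to consider. */
uint32_t candidate_bases(void)
{
    return 1u << ENTROPY_BITS;
}
```

With these assumed parameters there are 65536 possible bases per segment, which is the quantity the "Is this enough?" question is about.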
Attacking Address Space
Randomization
• Attacker needs address of only one
function to make successful attack
• Information leaks can reveal this
– format string vulnerability
• 16 bits can be brute-forced
– Shacham et al. show how to do this in 216
seconds over a network
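A rough back-of-the-envelope estimate of why 16 bits is within reach. It assumes the randomized offset stays fixed while the attacker probes and that no candidate is tried twice; both are assumptions for this sketch, not claims from the slides. With N = 2^k equally likely offsets, the expected number of guesses is:

```latex
\[
E[\text{guesses}] = \sum_{i=1}^{N} i \cdot \frac{1}{N} = \frac{N+1}{2},
\qquad N = 2^{16} \;\Rightarrow\; E[\text{guesses}] = 32768.5
\]
```

Each additional bit of entropy doubles this figure, which is why the size of the key matters so much here.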
Can we use a larger key?
• Can’t get more than 20 bits without
changing virtual memory system
• We can add padding to stack and code
• We can rearrange functions and data
structures in memory
– but this is tricky for shared libraries
• But an attacker needs only one address to
succeed
• 64-bit address spaces may help
Address Space Obfuscation and
Randomization
• start address
• reorder
• gaps
• encryption
Defenses - Stack
• Canary Value
• Write/Executable Pages
• Padding
• Local Variables Reordering
• Parameter Reordering
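To illustrate the canary idea from the list above, here is a toy sketch. A struct stands in for a stack frame so that the "overflow" stays within defined behavior; real canaries are inserted by the compiler next to the return address. The names struct frame and copy_and_check, and the 16-byte buffer, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a stack frame: a buffer followed by a canary word. */
struct frame {
    char     buf[16];
    uint32_t canary;
};

/* Copy `len` bytes into the frame the way an unchecked copy would,
 * then report whether the canary still holds its expected value.
 * Returns 1 if intact, 0 if the copy ran past buf. Lengths beyond the
 * whole frame are clamped so the sketch itself stays in bounds. */
int copy_and_check(const char *input, size_t len, uint32_t secret)
{
    struct frame f;
    memset(&f, 0, sizeof f);
    f.canary = secret;

    if (len > sizeof f)
        len = sizeof f;
    memcpy(&f, input, len);      /* may spill from buf into canary */

    return f.canary == secret;
}
```

A copy that fits in buf leaves the canary alone; a longer one changes it, and the program can abort instead of returning through a corrupted frame.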
Defenses – Memory Layout
Randomization
• Base Address Randomization
– stack
– heap
– text
– DLL
Defenses – Memory Layout
Randomization
• Reordering of static variables
• Reordering of routines
• Gaps in heap
• Gaps between routines
Pointer Encryption
• Rearranging address spaces doesn’t give us a
very large key
• Can we have diversity not just in how memory is
laid out, but in what pointers mean?
• What if we encrypted all pointers in the program?
– We could use a larger key
– Attacker must guess key in order to overwrite a return
address with something meaningful
PointGuard
• Developed by Cowan et al. at Immunix
• All pointers stored in memory are
encrypted
• Pointers are decrypted immediately before
dereference
• Pointers are encrypted before storing in
memory
• An attacker must guess key in order to
generate valid pointer to attack code
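The encrypt-on-store, decrypt-before-dereference cycle can be sketched as follows. XOR with a per-process key is assumed here as the cipher, and the pg_ names are hypothetical; in the scheme described above the compiler emits these steps, so application code would not call such functions by hand.

```c
#include <stdint.h>

/* Per-process key; a real system would draw it at program start. */
static uintptr_t pg_key;

void pg_set_key(uintptr_t key) { pg_key = key; }

/* Encrypt a pointer before it is stored in memory. */
uintptr_t pg_encrypt(void *p) { return (uintptr_t)p ^ pg_key; }

/* Decrypt a stored value immediately before dereference. */
void *pg_decrypt(uintptr_t stored) { return (void *)(stored ^ pg_key); }
```

A raw address written over a stored pointer by an attacker is also passed through pg_decrypt, so without the key it comes out as a different address than the one intended.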
PointGuard code transformation
• Unlike address space transformations,
requires compiler changes
• Cleartext pointers appear only in registers
– Registers are not vulnerable to modification
– Encryption must be fast and efficient
• We don’t want to encrypt non-pointer data,
because that would mean encrypting the
buffer containing the attacker’s pointer
• Accessing libraries is tricky
Effectiveness of PointGuard
• Overhead is low
– but requires recompilation
– interaction with non-PG-aware code is tricky
• Defends against most code injection and
return-to-existing-code attacks
• Does not defend against all data
modification attacks
• Information leaks may reveal ciphertext,
allowing attacker to guess key
What if code gets in anyway?
• The previous techniques work by
preventing an attacker from jumping to
malicious code in the system
• What if we didn’t think of every way that
could happen?
• Defense-in-depth:
– make sure injected code won’t run no matter
how control is transferred
What must an attacker know?
• An attacker must know how to write code
to run on the targeted system
– SPARC exploit code will not run on x86
• What if no two computers had the same
instruction set?
– It would be difficult or impossible to write
exploit code that will run everywhere
Instruction Set Randomization
Kc, Keromytis, and Prevelakis:
– Encrypt the program’s instructions with a
different key for each copy of the program
– Decrypt each instruction at runtime
immediately before execution
– Attacker must know key in order to write code
that will decrypt to something meaningful
– Unsuccessful attack will cause illegal
instruction, address, or raise exception
How many bits do we need?
• Strong symmetric cryptography typically
requires a 128-bit key or larger to resist
known-plaintext attacks
• Large performance penalty to decrypt
• If we assume attacker doesn’t have our
ciphertext, we can use much smaller key
– 32-bit XOR may be good enough if our goal is
to prevent large-scale automated worms
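A minimal sketch of the 32-bit XOR idea, treating the code image as an array of 32-bit words. The function name isr_xor is hypothetical, and in a real system the decoding would happen in the processor or emulator at fetch time rather than over a whole buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode or decode a code image in place by XORing every 32-bit word
 * with the per-copy key. XOR is its own inverse, so one routine
 * serves both directions. */
void isr_xor(uint32_t *words, size_t count, uint32_t key)
{
    for (size_t i = 0; i < count; i++)
        words[i] ^= key;
}
```

Legitimate code is encoded once and decodes back to the original words. Injected code was never encoded, so the same decoding step turns it into different words than the attacker wrote.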
Encoding schemes
• XOR:
– each word in legitimate code is XORed with the same
key
• Bit permutation:
– The bits in each word are rearranged according to a
key:
• 32 × 5 = 160 bits to store the key for a 32-bit word; log2(32!) ≈ 118 bits effective
– Can move bits from one instruction to another
• In practice, key size is smaller:
– more than one way to encode an instruction
– more than one harmful instruction
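The bit-permutation scheme can be sketched like this, with the key represented as an array of 32 target positions. The representation and the names permute_bits and unpermute_bits are assumptions for illustration; the sketch permutes bits within one word only and does not move bits between instructions.

```c
#include <stdint.h>

/* Encode: bit i of the input moves to position perm[i] of the output.
 * `perm` must hold each value 0..31 exactly once; it is the key. */
uint32_t permute_bits(uint32_t word, const uint8_t perm[32])
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 32; i++)
        out |= ((word >> i) & 1u) << perm[i];
    return out;
}

/* Decode: bit perm[i] of the input moves back to position i. */
uint32_t unpermute_bits(uint32_t word, const uint8_t perm[32])
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 32; i++)
        out |= ((word >> perm[i]) & 1u) << i;
    return out;
}
```

Unlike XOR, encoding and decoding are different operations here, so the decoder needs the inverse mapping.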
Variable-sized instructions
• x86 instructions vary in size
• Some instructions are 1 byte
– 8 bit key insufficient
• Padding with NOPs has cost
– generally requires source code
• Solution 1:
– Pad branch targets only
• Solution 2:
– Encrypt words, not instructions
x86 Implementation
• Authors modified Bochs x86 emulator to
decrypt code at runtime
• Encrypted image consisting of kernel and
statically-linked binaries
• Cost of emulation is high for CPU-bound
processes
• Not so bad for I/O bound processes
• Reprogrammable processors could reduce
overhead (Transmeta Crusoe)
Interpreted Languages
• Some code injection attacks use VBScript,
SQL, Perl, or shell languages
• Append key material to keywords:
– e.g. foreach becomes foreach12345
• Overhead is negligible
– The languages are interpreted anyway
• Error messages may reveal key
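A small sketch of keyword tagging, using the foreach12345 example from above. The helper names tag_keyword and is_tagged_keyword are hypothetical; a real interpreter would do this check inside its lexer.

```c
#include <stdio.h>
#include <string.h>

/* Write keyword followed by key into out, e.g. "foreach" + "12345".
 * Returns 0 on success, -1 if out is too small. */
int tag_keyword(const char *keyword, const char *key,
                char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", keyword, key);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Accept a token as the given keyword only if it carries the key suffix. */
int is_tagged_keyword(const char *token, const char *keyword,
                      const char *key)
{
    size_t klen = strlen(keyword);
    return strncmp(token, keyword, klen) == 0 &&
           strcmp(token + klen, key) == 0;
}
```

Script text injected by an attacker who does not know the key contains plain foreach, which the tagged interpreter no longer recognizes as a keyword.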
Libraries
• Libraries present a problem:
– Use different keys for applications and
libraries
– Use single key for all system libraries
– Change the key from time to time
• Or:
– Statically link everything so that library code
uses same key as application
Other issues
• Self-modifying code won’t run
(Yes, gcc sometimes generates this)
• Significant performance penalty
• Attacker with ciphertext could brute-force
the key offline
– No defense against local attackers
– May be okay for defense against worms
• Does not resist existing code attacks
• Does not resist data corruption attacks
Algorithmic Complexity Attacks
• The Linux networking code uses hash tables to
classify packets
• Hash tables, binary trees, and other data
structures have good performance in average
case
– But poor performance in worst case
• An attacker who knew the hash function could
deliberately generate collisions
– This can force worst-case behavior
– This can cause denial of service
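To make the collision idea concrete, here is a toy sketch with a deliberately weak hash (the sum of the bytes) and an eight-bucket table. Neither is taken from the Linux networking code; weak_hash, chain_length, and NBUCKETS are hypothetical names chosen for the example.

```c
#include <stddef.h>

#define NBUCKETS 8u

/* Deliberately weak, publicly known hash: sum of the bytes. */
unsigned weak_hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h += (unsigned char)*s++;
    return h;
}

/* Count how many of the n keys (n >= 1) fall into the same bucket as
 * keys[0]; that is the chain length a lookup for keys[0] may walk. */
size_t chain_length(const char *const keys[], size_t n)
{
    size_t len = 0;
    unsigned target = weak_hash(keys[0]) % NBUCKETS;
    for (size_t i = 0; i < n; i++)
        if (weak_hash(keys[i]) % NBUCKETS == target)
            len++;
    return len;
}
```

Because any rearrangement of the same bytes has the same sum, an attacker who knows the function can send inputs that all land in one bucket, turning constant-time lookups into a linear scan.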
Diversity as a Defense
• Attacker can find collisions only if he
knows hash function
• What if every copy used a different hash
function?
• Solution: keyed hash functions
– Every copy uses same code
– Every copy uses a different key
– Attacker cannot force collisions without key
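A sketch of "same code, different key": the FNV-1a hash with its starting state mixed with a per-copy key. This is an illustrative stand-in chosen for brevity, not a vetted keyed hash; a real deployment would want a construction designed to resist collision search under an unknown key. The name keyed_hash is hypothetical.

```c
#include <stdint.h>

/* FNV-1a with the starting state mixed with a per-copy key.
 * Illustrative only; not a vetted keyed hash. */
uint32_t keyed_hash(uint32_t key, const char *s)
{
    uint32_t h = 2166136261u ^ key;   /* FNV offset basis XOR key */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;               /* FNV prime */
    }
    return h;
}
```

Every copy runs the same function, but two copies with different keys assign different hash values to the same input, so a collision set computed for one key is not automatically a collision set for another.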
Effectiveness
• The techniques presented are orthogonal
• Other attacks:
– integer overflow
– data modification
• Other threat models:
– local attacker
– determined remote attacker
– denial of service
Other approaches
• StackGuard, StackShield, MemGuard, etc.
– bounds checking, canaries, non-executable
stack and heap
• Safe library routines, wrappers
• Sandboxes and safe languages (Java)
• Static analysis to detect (or prove the
absence of) buffer overflows
Will this prevent catastrophic
failures?
3. Things will be much like they are now: persistent
threats, common annoyances, but people will
still trust Internet for semi-critical tasks.
4. Technologies have emerged (and been
successfully deployed) that make epidemic
attacks a thing of the past. The Internet will be
trusted for the most critical tasks.
Do these techniques give us hope for (4)?