Uploaded by totipic763

weizz-issta2020-slides

advertisement
WEIZZ: Automatic Grey-Box Fuzzing for
Structured Binary Formats
Andrea Fioraldi, Daniele Cono D’Elia and Emilio Coppa
@andreafioraldi
andreafioraldi@gmail.com
Format-aware Fuzzing
Input
Format
Model
Input
Generation
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
Program
Under Test
Crashes
2
Format-aware Fuzzing
●
LangFuzz
●
Peach
●
Spike
●
CSmith
●
...
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
3
Problems
●
Impossible if the input structure is unknown
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
4
Problems
●
Impossible if the input structure is unknown
●
May fail to find bugs related to syntactically invalid inputs
in parsers
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
5
Problems
●
Impossible if the input structure is unknown
●
May fail to find bugs related to syntactically invalid inputs
in parsers
●
Parser implementations do not always closely mirror format
specifications
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
6
Problems
●
Impossible if the input structure is unknown
●
May fail to find bugs related to syntactically invalid inputs
in parsers
●
Parser implementations do not always closely mirror format
specifications
●
Models take some time to be written by a human (and contains
simplifications)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
7
Problems
●
Impossible if the input structure is unknown
●
May fail to find bugs related to syntactically invalid inputs
in parsers
●
Parser implementations do not always closely mirror format
specifications
●
Models take some time to be written by a human (and contain
simplifications)
●
Wrong models make fuzzing ineffective
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
8
Solutions?
●
Automatically learn the model from the actual implementation
of the parser
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
9
Solutions?
●
Automatically learn the model from the actual implementation
of the parser
●
Generate not always syntactically valid inputs
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
10
Solutions?
●
Automatically learn the model from the actual implementation
of the parser
○
(Approximation of) Taint Tracking
■
○
Machine Learning
■
○
[Learn&Fuzz] [REINAM]
Oracle based
■
●
[Tupni] [Autogram] [Polyglot] [Grimoire]
[GLADE]
Generate not always syntactically valid inputs
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
11
Coverage-guided Fuzzing
Coverage
Corpus
Input
Mutation
Program
Under Test
Crashes
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
12
Problems
●
Fail to explore deep paths behind parsers
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
13
Problems
●
Fail to explore deep paths behind parsers
●
Affected by roadblocks (multi-byte comparisons, checksums,
hashes, …)
if (hash(input[0:8]) != input[8:12]) exit(1)
if (input[12:16] == 0xABADCAFE) bug()
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
14
Structured Fuzzing
Corpus
Coverage
Input
Format
Model
Input
Mutation
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
Program
Under Test
Crashes
15
Structured Fuzzing
●
AFLSmart
●
Nautilus
●
Superion
●
Libprotobuf-Mutator
●
Zest
●
...
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
16
Bypass Roadblocks
●
Concolic Fuzzing
○
●
(Approximation of) Taint Tracking
○
●
[Driller] [QSYM] [Eclipser]
[TaintScope] [Vuzzer] [Angora] [Redqueen]
Sensitive feedbacks
○
[LAF-Intel] [CompareCoverage] [FuzzFactory] [IJON]
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
17
Bypass Roadblocks
●
Concolic Fuzzing
○
●
(Approximation of) Taint Tracking
○
●
[Driller] [QSYM] [Eclipser]
[TaintScope] [Vuzzer] [Angora] [Redqueen]
Sensitive feedbacks
○
[LAF-Intel] [CompareCoverage] [FuzzFactory] [IJON]
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
18
Idea #1
●
Reuse expensive analysis to bypass roadblocks previously
explored in past works to enable Structure-aware mutations
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
19
Bypass Roadblocks [Redqueen]
●
Mutations targeting magic byte comparisons (Input-To-State)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
20
Bypass Roadblocks [Redqueen]
●
Mutations targeting magic byte comparisons (Input-To-State)
input: AAAABBBBCCCCBBBB
cmp eax, FFFF → eax = BBBB
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
21
Bypass Roadblocks [Redqueen]
●
Mutations targeting magic byte comparisons (Input-To-State)
input: AAAABBBBDDCCDDCC (equivalent in coverage)
cmp eax, FFFF → eax = BBBB
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
22
Bypass Roadblocks [Redqueen]
●
Mutations targeting magic byte comparisons (Input-To-State)
input: AAAABBBBDDCCDDCC (equivalent in coverage)
cmp eax, FFFF → eax = BBBB
new input: AAAAFFFFDDCCDDCC
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
23
Bypass Roadblocks [Redqueen]
●
Mutations targeting magic byte comparisons (Input-To-State)
●
Patch out checksum checks
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
24
Formats as an AST [Grimoire]
+
/
=
12
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
5
3
25
Not all formats are parsed into an AST
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
26
Comparisons for validation
if (chunk->size_field > SIZE_MAX)
error(“Invalid Chunk Size”);
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
27
Idea #2
●
Instead of using memory accesses to reconstruct the format
([Tupni] [Autogram]) use the comparisons instructions that are
likely validation checks
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
28
Idea #3
●
Don’t learn a model and use it to guide the fuzzer, but
reconstruct each time the structure and apply mutations.
This avoids the problem of having errors in the learning
process.
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
29
Weizz
●
Based on AFL 2.52b
●
Binary-only (QEMU)
●
Approximate Taint to bypass Roadblocks and learn information
about validation checks
●
Structural mutations based on that information (inspired by
[AFLSmart])
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
30
Architecture
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
31
Architecture
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
32
Architecture
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
33
GetDeps: Approximating Taint Tracking
Input: AAAABBBBCCCCDDDD
cmp eax, FFFF → eax = AAAA
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
34
GetDeps: Approximating Taint Tracking
Input: AAAABBBBCCCCDDDD
cmp eax, FFFF → eax = AAAA
Bitflip #1: BAAABBBBCCCCDDDD
cmp eax, FFFF → eax = BAAA
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
35
Detect Checksum Checks
●
One operand is I2S
●
The other operand is not I2S and GetDeps revealed dependencies
on some input bytes
●
The sets of their byte dependencies are disjoint
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
36
Input Tags
●
Comparison ID
●
Timestamp
●
Parent ID
●
Number of tags with the same ID
●
The Comparison ID of the inner checksum that guard this byte
●
Flags (which CMP operand, if this is a checksum field, …)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
37
Many Comparisons affected by the same byte
1.
Prioritize Checksum fields
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
38
Many Comparisons affected by the same byte
1.
Prioritize Checksum fields
2.
Prioritize comparisons appeared earlier in time (possible
validation checks)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
39
Many Comparisons affected by the same byte
1.
Prioritize Checksum fields
2.
Prioritize comparisons appeared earlier in time (possible
validation checks)
3.
Prioritize if the number of bytes influencing the comparison
are low
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
40
Fixing Checksum
●
Late-stage repair
●
Topological Sort (Tags have the info for this)
●
Unpatch false positives
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
41
Locating Fields
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
42
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
43
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
1.
Pick a tag type
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
44
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
1.
Pick a tag type
2.
Recurse if next Timestamp (ts) > current
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
45
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
1.
Pick a tag type
2.
Recurse if next Timestamp (ts) > current
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
46
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
1.
Pick a tag type
2.
Recurse if next Timestamp (ts) > current
3.
Go forward if next ID = current Parent
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
47
Locating Chunks
struct {
int type;
int x , y;
int cksm;
};
1.
Pick a tag type
2.
Recurse if next Timestamp (ts) > current
3.
Go forward if next ID = current Parent
4.
With a probability take untagged part and
recurse again
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
48
Mutating Chunks [AFLSmart]
●
Addition
●
Deletion
●
Splicing
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
49
Mutating Chunks [Weizz]
●
Addition
○
Select a chunk A and adds a chunk from another input in the queue with the
same parent ID in the first tag of A before or after A
Current input:
A
Other input:
Generated input:
B
A
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
B
50
Mutating Chunks [Weizz]
●
Deletion
○
Select a chunk and removes it
Current input:
A
Generated input:
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
51
Mutating Chunks [Weizz]
●
Splicing
○
Select a chunk A and replaces it with a chunk from another input in the
queue with the same comparison ID in the first tag
Current input:
A
Other input:
Generated input:
B
B
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
52
Evaluation
1.
Comparison with popular fuzzers over chunk-oriented programs
2.
New bugs found by Weizz
3.
Role of structural mutations and roadblock bypassing?
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
53
Evaluation
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
54
Evaluation (60% conf. intervals)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
55
Evaluation (60% conf. intervals)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
56
Evaluation
w/o I2S
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
57
Evaluation (60% conf. intervals)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
58
Evaluation (60% conf. intervals)
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
59
Bugs
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
60
Evaluation
w/o I2S
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
w/o struct. mut.
61
Future Directions
●
Taint Tracking for large inputs
●
More chunk location heuristics
○
Exclude types of tags as starting point for a chunk
○
Apply traditional file-format reverse engineering algorithms based on memory
accesses to tags
●
Port to other OSes
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
62
Thank You
https://github.com/andreafioraldi/weizz-fuzzer
WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats
63
Download