“Hex sucks. A better mapping must be possible.” -Dan Kaminsky

Visual Forensic Analysis
and Reverse Engineering
of Binary Data
Gregory Conti
Erik Dean
United States Military Academy
West Point, New York
gregory.conti@usma.edu
erik.dean@usma.edu
The views expressed in this presentation
are those of the authors and do not reflect
the official policy or position of the United
States Military Academy, the Department
of the Army, the Department of Defense or
the U.S. Government.
http://www.whitehouse.gov/omb/budget/fy2005/images/justice-7.jpg
A brief fashion statement...
http://www.amazon.com/gp/product/images/B000UI91XY/sr=1-12/qid=1216606930/ref=dp_image_text_0?ie=UTF8&n=377110011&s=watches&qid=1216606930&sr=1-12
http://www.amazon.com/gp/product/images/B000RWJG6U/sr=1-5/qid=1216606930/ref=dp_image_text_0?ie=UTF8&n=377110011&s=watches&qid=1216606930&sr=1-5
Outline
• The Problem – Tiny Windows
• Background and Motivation
• Related Work
• Moving Beyond Hex
• System Design
• Case Studies
• Demos
Releasing Two Tools...
• data file (doc, xls, txt…): operated on by applications
• exe, ELF, PE...: executed by OS
• other special cases: core dump, pagefile.sys, hiberfil.sys…
• memory: process memory, cache…
• network: packets…
[Figure: tool landscape. Axes: lower insight to high insight; general purpose to precise application. Tools shown: IDA Pro, OllyDbg, BinNavi (Zynamics), BinDiff (Zynamics)…; Filemon, Regmon…; 011; hex editors, hexdump, grep & diff, strings; objdump; original application]
strings / grep / diff
H:\Datasets>strings 20040517_homeISP.pcap | more
Strings v2.4
Copyright (C) 1999-2007 Mark Russinovich
Sysinternals - www.sysinternals.com
0hF
M@y
7bs
Z19Z
MICROSOFT NETWORKS
WINDOWS USER
Microsoft Security Bulletin MS03-043
Buffer Overrun in Messenger Service Could Allow Code Execution
(828035)
Affected Software:
Microsoft Windows NT Workstation
Microsoft Windows NT Server 4.0
Microsoft Windows 2000
...
Hex Editors
Hex Workshop
011
WinHex
F-Secure Malware
http://www.f-secure.com/weblog/archives/00000662.html
IDA Pro
v5.1
http://www.hex-rays.com/idapro/
Zynamics BinDiff
http://www.zynamics.com/content/_images/bindiff_scr2.gif
Zynamics BinNavi
http://www.zynamics.com/index.php?page=binnavi
nwdiff
http://computer.forensikblog.de/en/2006/02/compare_binary_files_with_nwdiff.html
http://www.geocities.jp/belden_dr/ToolNwdiff_Eng.html
Dot Plots & Visual BinDiff
(Kaminsky)
Self-Similarity in
a single file. (.NET Assembly)
Diffing Two Files
images: Dan Kaminsky, CCC2006
Framework
• File Independent Level
– Entropy
– Byte Frequency
– N-Gram Analysis
– Strings
– Hex / Decimal / ASCII
– Bit Plot (2D/3D)
– File Statistics
• File Specific Level
– Complete or Partial Knowledge of File Structure
– For Example, Metadata
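The file-independent measures listed above (entropy, byte frequency, n-grams) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in vizbin or danglybytes; the function names are hypothetical.

```python
import math
from collections import Counter


def byte_frequency(data: bytes) -> list[int]:
    """Return a 256-entry histogram: counts[v] is how often byte value v occurs."""
    counts = [0] * 256
    for value in data:
        counts[value] += 1
    return counts


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 at most."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in byte_frequency(data):
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def ngram_counts(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte sequence in data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))
```

Computing the entropy over a sliding window rather than the whole file gives a per-region value that could drive an entropy color coding; the window size would be a tuning choice.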
[Figure: Textual Hex/ASCII Detail View; Traditional Textual Utilities (strings...); Graphical Displays; Machine Assisted Mapping and Navigation; Hex Editor Core]
Towards a Visual Hex Editor
• Malware Analysis
• Locate Embedded Objects
– Encoding / Encryption
• Audit Files for Vulnerabilities
• Compare files (Diffing)
• Cracking
• Analyze Unknown/Undocumented File Format
• Cryptanalysis
• Perform Forensic Analysis
• File System Analysis
• Reporting
• File Fuzzing
Goals
• Handle Large Files
• Many Insightful Windows
• Big Picture Context
• Improved Navigation
• Data Files / Executable Files
• Hex editor best practices are the foundation
• Support Art & Science
• Provide rapid analysis capability
• Inform machine processing development
• Fun
Two Approaches
• vizbin
– C# VS 2008
• danglybytes
– C# VS 2005
vizbin
• Textual: Text/ASCII
• Graphical: Byte plot, Byte Frequency Plot (overview + detail)
• Interaction: navigation arrows, search, entropy display
• Plug and Play Design: allows dynamic addition of new visualizations
Interface
Memory Management
Problem: .NET limits image height and width to <= 65535
Solution: Create a table with each cell containing a start and end offset into the file to be visualized
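The offset table described above might look like the following sketch. The cell size and the end-exclusive convention are assumptions made for illustration; the slide does not say how vizbin sizes its cells.

```python
def build_offset_table(file_size: int, cell_size: int) -> list[tuple[int, int]]:
    """Split the byte range [0, file_size) into consecutive cells.

    Each cell is a (start, end) pair with end exclusive (an assumption of
    this sketch), so a cell can be rendered as its own image without ever
    exceeding the per-image size limit.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return [
        (start, min(start + cell_size, file_size))
        for start in range(0, file_size, cell_size)
    ]


# Hypothetical usage: one cell per 640 x 480 byte plot.
table = build_offset_table(file_size=1_000_000, cell_size=640 * 480)
```

Only the cell currently on screen needs to be read and rendered, which keeps each image well under the dimension limit regardless of file size.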
Byte Plot
[Figure: 640 x 480 pixel grid, one byte per pixel; example byte values 255, 108, 0, 40, ...]
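A byte plot can be sketched as follows: each byte becomes one grayscale pixel, laid out row by row. The 640-pixel row width follows the figure; the zero padding of the last row and the PGM output format are assumptions of this sketch, not features of the tools.

```python
def byte_plot(data: bytes, width: int = 640) -> list[list[int]]:
    """Lay bytes out row-major, one pixel per byte, value 0-255 as gray level.

    The final row is padded with 0 so every row has the same width; note
    that padding is then indistinguishable from real 0x00 bytes.
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


def write_pgm(rows: list[list[int]], path: str) -> None:
    """Write the rows as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    with open(path, "wb") as out:
        out.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in rows:
            out.write(bytes(row))
```

PGM is used here only because it needs no third-party library; most image viewers can open it.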
Color Coding
• ASCII
• Entropy
• Byte Frequency
Explorer.exe
(color coding: ASCII enhanced)
Legend: Printable ASCII; CRLF; Tab or Space; Other ASCII
Word 2007 Document
(color coding: entropy)
ASCII Viz
LZW Viz
Packing
(color coding: byte frequency)
Original
Explorer.exe
cmd.exe
UPX
What if we could use the
same Viz techniques to
find hidden messages in
other files?
Erik Demo
Embedded Messages – MP3 Stego
Overview of the MP3 in question
But on closer inspection there’s something out of the ordinary which looks a bit like ASCII text…
On closer inspection of the suspect section of the file, we find…
VizBin Future Work / Lessons Learned
• Complete ‘Magic File’ search and display
• Change the memory table model to a memory array model
– Memory table navigation is not intuitive enough
• Add navigation through overview image
• Add interactive interrogation of detail image
danglybytes
• Textual: Text/ASCII, Strings, ByteCloud
• Graphical: Bitplot, BytePlot, RGBPlot, BytePresence, ByteFrequency, Digram, Dotplot
• Interaction: VCR, Memory Map, Color Coding
Traditional Views
Hex / ASCII View
Strings
Strange Attractors and TCP/IP
Sequence Number Analysis
(Michal Zalewski)
• http://lcamtuf.coredump.cx/oldtcp/tcpseq.html
• http://lcamtuf.coredump.cx/newtcp/
Digraph View
black hat
bl (98,108)
la (108,97)
ac (97,99)
ck (99,107)
k_ (107,32)
_h (32,104)
ha (104,97)
at (97,116)
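The pairing shown above can be reproduced with a short sketch: every byte is paired with its successor, and each pair is a coordinate in a 256 x 256 grid. The function names are hypothetical, and which byte maps to which axis is an assumption here.

```python
def digraph_points(data: bytes) -> list[tuple[int, int]]:
    """Pair each byte with the byte that follows it."""
    return list(zip(data, data[1:]))


def digraph_matrix(data: bytes) -> list[list[int]]:
    """Count how often each (first, second) byte pair occurs.

    grid[first][second] is the count; a nonzero cell is a plotted point.
    """
    grid = [[0] * 256 for _ in range(256)]
    for first, second in zip(data, data[1:]):
        grid[first][second] += 1
    return grid


points = digraph_points(b"black hat")
```

For the string "black hat" this yields the eight pairs listed above, starting with (98,108) for "bl".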
Digraph View
[Figure: 256 x 256 grid with axes Byte 0, Byte 1, ... Byte 255; each byte pair is plotted as a point, e.g. (32,108) and (98,108)]
uuencoded
slashdot.org
compression
encryption
.txt
incrementing
words
constrained pairs
Bit Plot
[Figure: 640 x 480 pixel grid, one bit per pixel]
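A bit plot can be sketched the same way as a byte plot, except that each bit becomes one pixel. The most-significant-bit-first order is an assumption of this sketch; the slide does not state the bit order.

```python
def bit_plot(data: bytes, width: int = 640) -> list[list[int]]:
    """Expand each byte into 8 pixels (1 = set bit, 0 = clear bit), row-major.

    Bits are emitted most significant first (an assumption). The last row
    may be shorter than width if the bit count is not a multiple of it.
    """
    bits = []
    for value in data:
        for shift in range(7, -1, -1):
            bits.append((value >> shift) & 1)
    return [bits[i:i + width] for i in range(0, len(bits), width)]
```

At one bit per pixel a 640 x 480 image covers 38,400 bytes of the file.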
Byte Plot Example
(Word Document)
Byte Presence
[Figure: axis of byte values 0 to 255; example bytes 255, 108, 0, 40, 128, 255]
RGB Plot
[Figure: 640 x 480 pixel grid, three bytes per pixel (red, green, blue)]
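An RGB plot groups three consecutive bytes into one pixel. The sketch below assumes the bytes map to red, green, and blue in that order and simply drops an incomplete trailing group; neither choice is confirmed by the slides.

```python
def rgb_plot(data: bytes, width: int = 640) -> list[list[tuple[int, int, int]]]:
    """Group bytes in threes as (red, green, blue) pixels, laid out row-major.

    Any 1 or 2 leftover bytes at the end are dropped (an assumption).
    """
    usable = len(data) - len(data) % 3
    pixels = [
        (data[i], data[i + 1], data[i + 2])
        for i in range(0, usable, 3)
    ]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]
```

Because each pixel carries three bytes, the same screen area shows three times as much of the file as a byte plot.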
Display Comparison
View          Pixels/Byte       19” Monitor   Gain
Textual Hex   300 pixels/byte   4.4 KB        N/A
Byte View     1 pixel/byte      1.3 MB        300x
RGB View      3 bytes/pixel     3.9 MB        900x
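The figures in the comparison can be checked with simple arithmetic. The sketch below assumes a 1280 x 1024 display; the slide only says 19” monitor, so the resolution is an assumption, chosen because it reproduces the stated capacities in decimal units.

```python
def bytes_visible(width_px: int, height_px: int, bytes_per_pixel: float) -> float:
    """Number of file bytes that fit on screen at the given density."""
    return width_px * height_px * bytes_per_pixel


# Assumption: 1280 x 1024 resolution (not stated on the slide).
WIDTH, HEIGHT = 1280, 1024

hex_view = bytes_visible(WIDTH, HEIGHT, 1 / 300)  # 300 pixels per byte
byte_view = bytes_visible(WIDTH, HEIGHT, 1)       # 1 pixel per byte
rgb_view = bytes_visible(WIDTH, HEIGHT, 3)        # 3 bytes per pixel

for name, capacity in [
    ("Textual Hex", hex_view),
    ("Byte View", byte_view),
    ("RGB View", rgb_view),
]:
    gain = capacity / hex_view
    print(f"{name:12s} {capacity:10.0f} bytes  gain {gain:4.0f}x")
```

Under that assumption the screen holds about 4,369 bytes as hex text, 1,310,720 bytes as a byte plot, and 3,932,160 bytes as an RGB plot, which matches the 4.4 KB, 1.3 MB, and 3.9 MB entries.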
Dot Plots
• Jonathan Helfman’s “Dotplot Patterns: A Literal Look at Pattern Languages.”
• Dan Kaminsky, CCC & BH 2006
DotPlot Examples
Images: Jonathan Helfman, “Dotplot Patterns: A Literal Look at Pattern Languages.”
Kaminsky DotPlot
[Figure: N x N grid with both axes Byte 0, Byte 1, ... Byte N; O(N^2)]
Modified for Interactivity
[Figure: 500x500 grid with both axes Byte 0, Byte 1, ... Byte N; O(N)]
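A basic dotplot marks cell (i, j) whenever byte i equals byte j, which costs O(N^2) for N bytes. The sketch below bounds that cost by plotting only a fixed-size window. The 500-byte default echoes the 500x500 figure on the slide, but windowing is an assumption of this sketch; the slides do not describe how the interactive version is actually computed.

```python
def dotplot(data: bytes, max_bytes: int = 500) -> list[list[bool]]:
    """Return a square grid where grid[i][j] is True if byte i equals byte j.

    Only the first max_bytes bytes are compared, so the grid never exceeds
    max_bytes x max_bytes cells no matter how large the file is.
    """
    window = data[:max_bytes]
    return [[a == b for b in window] for a in window]
```

The main diagonal is always set, since every byte equals itself; repeated sequences show up as additional diagonal lines parallel to it.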
English Text
Bitmap Image
Compressed Audio
Byte Clouds
Tag Cloud
Smashing the Stack
for Fun and Profit
http://tagcrowd.com/
Byte Cloud
Byte Frequency
(word document)
[Figure: byte frequency over values 0 to 255, two panels: Unencrypted and AES]
Quick Assessments
• Alphabet in use
• Use of encryption
• Application file format exploration
• Fixed length structures
• Variable length structures
• Bitmaps
Pure Edge (.xfdl)
0A 31 37 42 48 4E 54 5A 66 6C 72 78
22 32 38 43 49 4F 55 61 67 6D 73 79
2B 33 39 44 4A 50 56 62 68 6E 74 7A
2D 34 3B 45 4B 51 57 63 69 6F 75
2E 35 3D 46 4C 52 58 64 6A 70 76
30 36 41 47 4D 53 59 65 6B 71 77
Encryption
unencrypted
XOR
unencrypted
AES
observe format changes
(~chosen plaintext attack)
.tiff
Insert Image from File
(.tiff)
Fixed Length Structure
Neverwinter Nights Database File
tvDebug.log
• Created by ZoneAlarm Firewall
• Can grow quite large
• 6.7M in this case
• Binary
• Seeking big picture context
• ~240 byte wide data structures
• Vertical bands identify identical values
• Exceptions visible
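The vertical bands mentioned above appear when the plot width matches the record width, because identical fields then line up in columns. One way to hint at a candidate width automatically is to measure how often a byte equals the byte a fixed distance later. This is a sketch of a general self-similarity heuristic, not the method used in the talk, where the structure was found visually; the function names are hypothetical.

```python
def match_score(data: bytes, width: int) -> float:
    """Fraction of positions i where data[i] equals data[i + width]."""
    pairs = len(data) - width
    if width <= 0 or pairs <= 0:
        return 0.0
    matches = sum(1 for a, b in zip(data, data[width:]) if a == b)
    return matches / pairs


def guess_record_width(data: bytes, max_width: int = 512) -> int:
    """Return the candidate width with the highest match score.

    Ties go to the smallest width, which helps avoid picking a multiple of
    the real record size. Returns 0 if data is too short to compare.
    """
    widths = range(1, min(max_width, len(data) - 1) + 1)
    return max(widths, key=lambda w: match_score(data, w), default=0)
```

The cost grows with the file size times max_width, so for a multi-megabyte log it would be reasonable to run this on a prefix of the file and confirm the result in the byte plot.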
Alphabet
(90 values)
01 03 05 09 0B 0D 0F
11 13 15 17 19 1D 1F
21 23 25 29 2B 2D 2F
31 33 35 37 3B 3D 3F
42 44 45 49 4B 4D 4F
51 53 54 55 56 57 59
5B 5D 5F 67 6D 80 82
84 86 88 8A 8C 8E 90
92 94 96 9A 9C 9E A0
A2 A4 A6 A8 AA AC AE
B0 B2 B4 B6 B8 BA BC
BE C0 C6 C8 CA CE D0
D2 D4 D6 DA DE EA
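An alphabet listing like the one above can be produced by collecting the distinct byte values in a file. The sketch below is a minimal illustration with hypothetical function names; the seven-values-per-row layout simply mirrors the listing.

```python
def alphabet_in_use(data: bytes) -> list[int]:
    """Return the distinct byte values present in data, in ascending order."""
    return sorted(set(data))


def format_alphabet(values: list[int], per_row: int = 7) -> str:
    """Render byte values as two-digit uppercase hex, per_row values per line."""
    lines = []
    for start in range(0, len(values), per_row):
        row = values[start:start + per_row]
        lines.append(" ".join(f"{value:02X}" for value in row))
    return "\n".join(lines)
```

A small alphabet, such as one limited to printable characters, suggests a text-based encoding, while an alphabet covering nearly all 256 values is consistent with compressed or encrypted content.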
Variable Length Structure
Thumbs.db
See http://www.acquisitiondata.com/white_papers/thumbsdbfiles.pdf
for a well written white paper.
pcap: 8 bits / pixel
pcap: packet length / color coding by protocol
pcap: multicolumn
Demo
(Firefox hdmp)
Firefox .hdmp
Redacted
PDF...
http://entertainment.slashdot.org/article.pl?sid=08/05/20/0228229
Example .NET Image Formats
Format8bppIndexed
Specifies that the format is 8 bits per pixel, indexed.
Format16bppGrayScale
The pixel format is 16 bits per pixel. The color information specifies 65536 shades of gray.
Format16bppRgb565
Specifies that the format is 16 bits per pixel; 5 bits are used for the red component, 6 bits are used for the green component, and 5 bits are used for the blue component.
Format1bppIndexed
Specifies that the pixel format is 1 bit per pixel and that it uses indexed color. The color table therefore has two colors in it.
Format24bppRgb
Specifies that the format is 24 bits per pixel; 8 bits each are used for the red, green, and blue components.
Format32bppArgb
Specifies that the format is 32 bits per pixel; 8 bits each are used for the alpha, red, green, and blue components.
Format48bppRgb
Specifies that the format is 48 bits per pixel; 16 bits each are used for the red, green, and blue components.
Format64bppArgb
Specifies that the format is 64 bits per pixel; 16 bits each are used for the alpha, red, green, and blue components.
http://msdn.microsoft.com/en-us/library/system.drawing.imaging.pixelformat(VS.80).aspx
Weaknesses
• Entire file may be extracted from bit/byte/RGB
– May trigger AV or IDS
– 8bit/byte steg
• Screams for big monitor
• Better memory management
– ~300MB+
• Unicode
Future Work
• Plug-ins / Editable Config Files
– Visualizations
– Encodings
• Saving state
– Memory Maps
• Improving Interaction
– What works / What doesn’t
• Multiple Files / File Systems
• REGEX search
• Automated Memory Map Generation
DAVIX
(Jan Monsch and Raffy Marty)
DAVIX Workshop
DEFCON Breakout Room
Sunday 2PM-4PM
http://www.secviz.org/node/89
Communities
http://secviz.org/
http://vizsec.org/
“The place to share, discuss,
challenge, and learn about security
visualization.”
“vizSEC is a research community for
computer security visualization.”
Raffy Marty
Splunk
John Goodall
Secure Decisions
VizSEC 2008
http://www.vizsec.org/workshop2008/
More Information
• “Visual Reverse Engineering of Binary and Data Files.” Gregory Conti, Erik Dean, Matthew Sinda, Benjamin Sangster. VizSEC 2008.
– Publicly available September
• Security Data Visualization (No Starch Press)
• Applied Security Visualization (Addison-Wesley)
Acknowledgements
Damon Becknell, Jon Bentley, Jean
Blair, Sergey Bratus, Chris Compton,
Tom Cross, Ron Dodge, Carrie Gates,
Chris Gates, Joe Grand, Julian
Grizzard, Toby Kohlenberg, Oleg
Kolesnikov, Frank Mabry, Raffy Marty,
Brent Nolan, Gene Ressler, Ben
Sangster, Dino Schweitzer, Matt
Sinda, and Ed Sobiesk
Feedback Welcome
• Visualization ideas
• Usage feedback
• Desired functionality / feature requests
• Plug-in architecture recommendations
• DanglyBytes is here...
www.rumint.org/db.zip
• We’ll have a link to VizBin up at
www.rumint.org shortly
Survey
Gregory Conti gregory.conti@usma.edu
Erik Dean erik.dean@usma.edu
Download