Document 14845376

advertisement
Tappan Zee (North) Bridge
Mining Memory Accesses for Introspection
Brendan Dolan-Gavitt, Tim Leek, Josh Hodosh, and
Wenke Lee
ACM CCS 11/6/2013
This work is sponsored by the Assistant Secretary of Defense for Research & Engineering under Air Force Contract #FA8721-05-C-0002. Opinions,
interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.
Photo by joseph a
Background:
Introspection
• Often we want to provide isolation for a
security-sensitive process
• This isolation limits visibility: a semantic gap
that must be bridged
• Lots of existing work on automatically
bridging this gap:Virtuoso,VMST
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Motivation
• Application introspection is hard
• Finding places to hook: also hard
• Neither scenario is supported with existing
techniques
• Currently done by manual reverse
engineering
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Fond Memories
• We focus on a primitive operation used by
(nearly) all apps: memory access
• Tapping memory accesses provides a way to
find places to hook for OS and application
introspection
• Memory accesses provide streams of data
that illuminate the inner workings of apps
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Tap Points
• Abstraction for “memory access made at
point in program”
• Idea: Data at a tap point of same “type”
• Consists of:
URLs
• Caller
• Program Counter
dmesg
• “Program” (CR3)
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Context
00a3bdgoogle.comr2ab.tmpa2bc
google.comr2ab.tmp
(a)
strcpy
(b)
00a3bda2bc
r2ab.tmp
google.com
memcpy
strcpy←open_file
strcpy←open_url
(c)
00a3bda2bc
ACM CCS 2013
memcpy
Tappan Zee (North) Bridge
11/6/2013
Context
00a3bdgoogle.comr2ab.tmpa2bc
google.comr2ab.tmp
(a)
strcpy
(b)
00a3bda2bc
r2ab.tmp
google.com
memcpy
strcpy←open_file
strcpy←open_url
(c)
00a3bda2bc
ACM CCS 2013
memcpy
Tappan Zee (North) Bridge
11/6/2013
Context
00a3bdgoogle.comr2ab.tmpa2bc
google.comr2ab.tmp
(a)
strcpy
(b)
00a3bda2bc
r2ab.tmp
google.com
memcpy
strcpy←open_file
strcpy←open_url
(c)
00a3bda2bc
ACM CCS 2013
memcpy
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Sample Tap Points
Content
Read
Write
!
00123456!
!
!
!
!
!
!
!
Tap
!
!
!
!
!
!
!
!
!
\Device\Harddisk!
00430F3C
00FFABED!
00123456!
00ABCDEF!
0064A42D!
\Device\Harddisk!
ACM CCS 2013
00646517
00646517
00646517
00646517
!
!
!
!
!
!
!
0064A423
0064A424
0064A427
0064A428
Code
Kernel!
Kernel!
Kernel!
Kernel!
0064A42D 00430E13 Kernel!
0064A42D 00430E15 Kernel
0064A423
0064A424
0064A427
0064A428
!
push
push
push
call
ebx!
[ebp+var_28]!
esi!
_memcpy!
_memcpy:!
[...]!
00430E08 shr
ecx, 2!
00430E0B and
edx, 3!
00430E0E cmp
ecx, 8!
00430E11 jb
short loc_430E3C!
00430E13 rep movsd!
00430E15 jmp
off_430F2C[edx*4]
Tappan Zee (North) Bridge
11/6/2013
Finding Tap Points
• Millions of tap points in even short slices of
system execution
• Gigabytes of data generated
• Need effective, efficient ways of finding
useful tap points
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Monitoring Memory
Accesses
• Can instrument QEMU to log all memory
accesses – SLOW
• To get around this we use whole-system
record and replay
• Implemented in our open source dynamic
analysis framework (PANDA)
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
General Strategy
• Identify behavior of interest (“URL access”)
• Training: Create recording in which
behavior occurs (“visit google.com”)
• Search: Replay recording with
instrumentation and look for tap point with
desired content
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Search Strategies
• Known knowns
• Known unknowns
• Unknown unknowns
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Known Knowns:
String Searching
• Can be efficiently implemented using one
counter per tap point per search string
• Millions of tap points can be searched with
a few megabytes of memory
• We found: URLs, filenames, window titles,
SSL/TLS master keys
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Known Knowns:
TLS/SSL Keys
• If exact key is not known in advance (e.g.,
malicious server) we cannot string search
• However, we can do trial decryption on all
48-byte strings seen at tap points
• TLS MAC allows us to verify decryption
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
DA can, additionally, re-render
ode via a module extracted from
ussion of this capability as it is
oring
represent the expression. We leave this generalization
future work, however.
Statistical Search and Clustering
Known5.4 Unknowns:
Information Retrieval
2, tap points need information
Keeping track of this informadge about the CPU architecture
g, and so we decided to encaple plugin. TZB’s other analyses
call stack to arbitrary depth by
not worry about the details de-
Collecting bigram statistics on data that passes thro
each tap point is an efficient way to enable “fuzzy” se
based based on some training examples, as well as enab
clustering. To implement this we collect bigram statistic
all tap points seen in execution, as well as for the exemp
the data seen at each tap point is thus represented
sparse vector with 65,536 elements (one for each poss
pair of bytes).
To search, we can then sort the tap points seen by ta
the distance (according to some metric) from the exemp
For our metric, we have chosen to use Jensen-Shannon
vergence [18], which is a smoothed and symmetrized ver
of the classic Kullback-Leibler divergence [16] (also kn
as information gain). We also examined the Euclidean
cosine distance metrics, but found their performance t
consistently worse. Jensen-Shannon divergence between
probability distributions P and Q is defined as:
• Given some training examples, find tap
points containing “similar” data
• We compute bigram byte statistics for each
mation, the callstack plugin exit is translated, looking for an
nstruction (currently, we look for
v lr, pc on ARM). If the block
then we push the return address
ach time that block executes.
m a function does not require any
Before the execution of every baer the address we are about to
stack; if so, we pop it. We only
address of the basic block, beCCS 2013
terminates a ACM
basic block,
so the
tap point & training examples
• Sort by Jensen-Shannon divergence (similar
to mutual information / Kullback–Leibler
divergence)
✓
◆
JSD(P, Q) = H
P +Q
2
H(P ) + H(Q)
2
where H is Shannon entropy.
Tappan
Zee (North)
11/6/2013
Bigram
collectionBridge
is done by
maintaining, for each
Finding dmesg
(training)
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
dmesg
FreeBSD
MINIX
Haiku
Linux
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Unknown Unknowns:
Clustering
• Group tap points containing “similar” data
together
• Algorithm: K-means with Jensen-Shannon as
distance metric
• Unsupervised learning – no training
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
23456
gptid/85a1fb05-3e8f-11e2-80e5-525400123456
r0w0e0
err#0
e=box,label=’’PART\nada0\nr#2’’];
l=’’r1w0e0’’];
xc1e9eb00;
ABEL
a0p3
#3
ufsid/50bec59dcc135902
r0w0e0
err#0
FreeBSD Boot Cluster
DEV
ada0p3
r#3
in/stt/sbin/sysctl/bin/ps/sbin/sysctl/sbi
sbin/md/sbin/sysctl/sbin/sysctl/bin/ken/s
/bin/ps/sbin/sysctl/sbin/sysctl/sbin/sysc
r0w0e0
r0w0e0
n/ps/bin/dd/sbin/sysctl/bin/dat/bin/df/sb
LABEL
ada0p2
r#3
LABEL
ada0p2
r#3
r0w0e0
r0w0e0
el00000000-0000-0000-0000-000000000000000
ada0p3
/faa_N_=_peOfA=fA=feTr=tul.n=_eo/.b_Yt_vtectvifat=a=-sd_Ee
000-00000000000000000000-0000-0000-0000-0
r0w0e0
Ofu=u_0y:nF:tRseeeeEfciOtmdtuinlrlrrlpp/nppfpcepinl=l=.llN
err#0
lNlgllpl_.4l_l_2/l_l_22lileldlylo- 21laltlat=rrrsbgrskgni/
5d-3e9f-11e2-a557-525400123456993c915d-3e
00123456/boot/kernel/kernel/boot/kernel/k
russian|Russian Users Accounts: :charset=KOI8-R:
:
r#2
DEV
gptid/859ead5b-3e8f-11e2r#4
r#2
r0w0e
r0w0e0
DEV
ada0p3
r#3
n/rcorde/bin/cat/sbin/md/sbin/sysctl/sbin/sysctl/bin/ken/s
bin/dumpon/bin/ln/bin/ps/sbin/sysctl/sbin/sysctl/sbin/sysc
r0w0e0
r0w0e0
tl/sbin/sysctl/bin/ps/bin/dd/sbin/sysctl/bin/dat/bin/df/sb
1009a80’’>
/boot/kernel/kernel00000000-0000-0000-0000-000000000000000
ada0p3
00000-0000-0000-0000-00000000000000000000-0000-0000-0000-0 r0w0e0
00000000000993c915d-3e9f-11e2-a557-525400123456993c915d-3e err#0
9f-11e2-a557-525400123456/boot/kernel/kernel/boot/kernel/k
...
se/9.0.0/etc/rc.d/newsyslog
197947 2009-1
CONFIG_AUTOGENERATED
ugb $ modulesoptions
ident
GENERIC
machine i386
ACM CCS 2013
r0w0e0
r0w0e0
gptid/85a1fb05-3e8f-11e2-80e5-525400123456
r0w0e0
err#0
10362c0’’>
ada0p3
r#3
/sbin/in/bin/sh/bin/stt/sbin/sysctl/bin/ps/sbin/sysctl/sbi
e>
r1w0e0
DEV
DEV
PART
ufsid/50bec59dcc135902
ada0 r#4
ada0
nss_compat.so.1dhclientShared object ‘‘nss_compat.so.1’’ n
r0w0e0
r0w0e0
ot found,
required by ‘‘dhclient’’nss_nis.so.1dhclientShar
ed object ‘‘nss_nis.so.1’’ not found, required by ‘‘dhclie
nt’’nss_files.so.1dhclientShared object ‘‘nss_files.so.1’’
digraph geom {
z0xc1d8de00 [shape=box,label=’’PART\nada0\nr#2’’];
z0xc1f4f640 [label=’’r1w0e0’’];
z0xc1f4f640 -> z0xc1e9eb00;
LABEL
DEV
ada0p2
r#3
ada0p2
r1w0e0
err#0
lang=ru_RU.KOI8-R:
:
:passwd_format=md5:
:co
DEV
DEV
pyright=/etc/COPYRIGHT:
:welcome=/etc/motd:
:sete
CONFIG_AUTOGENERATED
gptid/85a9469d-3e8f-11e2-80e5-525400123456
gptid/85a1fb05-3e8f-11e2-80e5-525400123456
nv=MAIL=/var/mail/$,BLOCKSIZE=K,FTP_PASSIVE_MODE=YES:
r#4
r#4
gptid/85a9469d-3e8f-11e2-80e5-525400123456
r0w0e0
err#0
VFS
s.ada0p2
r#3
r1w0e0
ufsid/50bec59dcc135902
r0w0e0
err#0
ada0
LABEL
ada0p2
r#3
LABEL
VFS
r1w0e0
ada0p2
s.ada0p2
r#3 err#0 r#3
r0w0e0
r0w0e0
r1w0e0
gptid/859ead5b-3e8f-11e2r0w0e
err#0
DEV
ada0p2
r#3
r0w0e0
DISK
ada0
ada0p2
r#1
r1w0e0
err#0
DEV
ada0
r#2
Tappan Zee (North) Bridge
PART
ada0
r#2
11/6/2013
LABEL
ada0p1
r#3
r0w0e0
ada0p1
r0w0e0
err#0
Clustering Evaluation
• Manually labelled ~3000 tap points
• Compared human labels to clustering
• Very poor agreement
• Humans said most things were “binary
pattern”
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
TZB is General
• Four introspection tasks: URL access, file
access, SSL/TLS keys, boot messages
• Five operating systems: Windows 7, Linux,
FreeBSD, Minix, and Haiku
• Two architectures: x86_64, ARM
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Limitations
• Need to know data encoding
• One caller may not be enough (or too
much!)
• Contents may be written (temporally) out
of order
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Future Work
• Clustering results suggestive, but more
investigation is needed
• Application to memory forensics –
online vs. offline
• Tap points in JITed code
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Summary
• TZB locates tap points: places to interpose
for extracting security-relevant information
• Analysis to find tap points is heavyweight,
but is automated and only needs to be
done once, offline
• Improves on state of the art: expensive and
time-consuming manual reverse engineering
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Thanks for Listening
• All code related to this project is available
at https://github.com/moyix/panda
• Contact: brendan@cc.gatech.edu
http://cc.gatech.edu/~brendan/
• Questions?
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Performance
• Depends on analysis
• String search: ~10x
• SSL/TLS: ~500x
ACM CCS 2013
Tappan Zee (North) Bridge
11/6/2013
Download