Interactive deobfuscation

A thrift shop for static deobfuscation
whoami
• Security researcher
• Break stuff, reverse, make them better and
break again
• Part of the nullsec non-profit group
How it all started
blame this person =>
• Presumably a simple crackme
– Eventually discovered to be whitebox AES (wbaes)
• I wanted to solve it statically
– Since running things is cheating
– Goal was to solve it in less than a month
• A race I didn’t manage to win when working statically
• Name is md5’ed
• Serial is transformed / permuted using an
unknown function
Challenge archeology
• Overall, the crackme broke down into 2 main parts
• Deobfuscation
– Opaque predicates, lookup tables, value tables and
“spaghetti” code
• Cryptanalysis
– The original cipher was whitebox’ed
Deobfuscation
Deobfuscation – Layer0
• Found some jmps, decided to map them all
– find_lookuptables(“Mov <register>, dword ptr
[addr*4]”)
– Add xrefs, define locs
• IDA can’t map them all into graph views (due to size, more
RAM == bigger graph)
• After looking a bit, there seems to be some logic
and different operations inside them
• However they all lead to the same path
eventually
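A minimal sketch of the jmp-table mapping step, assuming the disassembly is available as plain text lines. It is not IDAPython; the regular expression, the sample listing and the addresses are made up for illustration, and the function only borrows the name find_lookuptables from the slide.

```python
import re

# Hypothetical pattern for a scaled table read such as
# "mov eax, dword ptr [ecx*4 + 0x405000]" (32-bit registers only).
TABLE_READ = re.compile(
    r"mov\s+(?P<dst>e[a-z]{2}),\s*dword ptr\s*"
    r"\[(?P<idx>e[a-z]{2})\*4\s*\+\s*(?P<base>0x[0-9a-f]+)\]",
    re.IGNORECASE,
)

def find_lookuptables(listing):
    """Return the sorted base addresses of all scaled dword table reads."""
    bases = set()
    for line in listing:
        match = TABLE_READ.search(line)
        if match:
            bases.add(int(match.group("base"), 16))
    return sorted(bases)

# Made-up listing: two table reads, one stack read, one indirect jmp.
listing = [
    "mov eax, dword ptr [ecx*4 + 0x405000]",
    "add eax, 4",
    "mov edx, dword ptr [eax*4 + 0x406000]",
    "mov ebx, dword ptr [esp + 8]",
    "jmp eax",
]
tables = find_lookuptables(listing)
```

In a real database the returned bases would be the places to add xrefs and define locs.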
Deobfuscation – Layer1
• Removal of jmps and basic block identification
– All the obfuscation was done in a manner that affects only the bb
itself; after a jmp to another table occurred, everything was
restored
• Follow_jmps_by_addr(addr) to find bb boundaries
– Follow jcc until a jmp / push + ret sequence is found
– Compress it, remove jccs and make one BB
– In case of xrefs, patch them together
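The compression step can be pictured with a toy model. This is a sketch under assumptions: blocks are a plain dict of address to instruction tuples, only direct jmps are followed, and jcc or push + ret handling is left out. The function name follow_jmps and the sample blocks are hypothetical.

```python
def follow_jmps(blocks, start):
    """Concatenate blocks chained by direct jmps into one instruction list."""
    merged, seen, addr = [], set(), start
    while addr in blocks and addr not in seen:
        seen.add(addr)
        body = blocks[addr]
        mnem, operand = body[-1]
        if mnem == "jmp" and isinstance(operand, int):
            merged.extend(body[:-1])  # drop the connecting jmp
            addr = operand            # continue at the jmp target
        else:
            merged.extend(body)       # real terminator: the bb ends here
            break
    return merged

# Three fragments that are really one basic block split by jmps.
blocks = {
    0x1000: [("mov", "eax, 1"), ("jmp", 0x2000)],
    0x2000: [("add", "eax, 2"), ("jmp", 0x3000)],
    0x3000: [("xor", "ebx, ebx"), ("ret", None)],
}
merged = follow_jmps(blocks, 0x1000)
```

The seen set stops the walk if the jmp chain loops back on itself.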
Deobfuscation – Layer2
• Opaque predicates
• Ops which are used to make the bb bigger
– Simple rule – operations are per bb and do not
exceed it
– Wrote a simple emulator to emulate bb and
optimize them to simple instructions
• 1 exception – do not touch lookup table values
– More on this later
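A rough idea of what such a per-bb emulator does, as a sketch only: it handles a made-up three-field instruction format with immediate operands, folds arithmetic on registers whose value is known, and leaves everything else alone. The lookup-table exception from the slide is not modeled, and simplify_block is a hypothetical name.

```python
MASK = 0xFFFFFFFF

OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
}

def simplify_block(block):
    """Fold constant arithmetic inside one bb into single mov instructions."""
    known = {}  # register -> constant value known at this point
    out = []    # instructions that could not be folded
    for mnem, reg, imm in block:
        if mnem == "mov":
            known[reg] = imm & MASK
        elif reg in known:
            known[reg] = OPS[mnem](known[reg], imm)
        else:
            out.append((mnem, reg, imm))
    # Emit one mov per register whose final value is a known constant.
    out.extend(("mov", reg, val) for reg, val in known.items())
    return out

junk_block = [
    ("mov", "eax", 0x10),
    ("add", "eax", 0x20),
    ("xor", "eax", 0x30),
    ("sub", "ebx", 1),
]
simplified = simplify_block(junk_block)
```

Because every operand is an immediate, reordering the folded movs after the untouched instructions does not change the result of the block.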
Deobfuscation – Layer 3
• Tables, and lots of them
– Apart from the jmptables which lead the way
• Tables are used as part of the cipher itself
• Key is dismantled inside them (more on this
later)
• Each table has a different role and some are
doubled for obfuscation
• FindTables to the rescue
Deobfuscation – Layer3
• FindTables basically taints memory and looks
for reads of 16b tables
• Once it finds one it defines an array of 0xFF at
that addr
• All value tables are mapped this way,
their usage however varies
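One way to picture the table discovery, assuming a list of addresses read by the code is already available (from tainting or tracing). This is a guess at the idea, not the original FindTables: the clustering rule, the min_hits threshold and the sample addresses are all assumptions, and the lowest address seen is simply taken as the table base.

```python
def find_tables(read_addrs, size=0xFF, min_hits=4):
    """Group read addresses into candidate tables of `size` bytes."""
    tables = []
    start, hits = None, 0
    for addr in sorted(set(read_addrs)):
        if start is not None and addr - start < size:
            hits += 1                       # still inside the current window
        else:
            if start is not None and hits >= min_hits:
                tables.append((start, size))
            start, hits = addr, 1           # open a new window
    if start is not None and hits >= min_hits:
        tables.append((start, size))
    return tables

# Made-up read trace: two dense regions and one stray read.
reads = [0x5000, 0x5010, 0x5003, 0x50A0, 0x5042, 0x9000,
         0x7000, 0x7001, 0x7080, 0x70FE, 0x7010]
found = find_tables(reads)
```

Each returned pair is a base address and the array size to define there.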
Deobfuscation – Layer 4
• Once we have all the code cleaned we get
several consecutive lookup tables
• Loops are unrolled and become normal
repetitive ops (per round and state)
• All deobfuscated code was written into a new
section called “deobf” to make code reading
easier
• It is now time to move on to the cryptanalysis
stage
Cryptanal
• Automating every process is
infeasible and too time-consuming
• I decided to split the work into two main
stages:
– Operation identification
– Key extraction
• Both are used interactively
– Thus the name interactive deobfuscation
Cryptanal archeology
• Discovered BGE attacks from academia
– Chow, Xiao
• sysk’s phrack article
• Eventually said FUCK YOU ALL gonna do it
myself w/o cryptic math
– Lack of algebra lessons and focus
Cryptanal – Layer0
• Actual wb code to encrypt a text
• Loops 9 times which made me quite frustrated
– Before discovering it was wb’ed
– After counting the loops by hand I thought it
might be AES
– But where’s the key ?
• LOLWTF ? md5(user) == wbaes.dec(serial,user_as_key)
– No, key must be *embedded*
• LOLWUT? md5(user) == wbaes.d/enc(serial,key) ??
– Output isn’t ascii so it could be both enc/dec
Cryptanal – Rijndael on a toe
• Several simple operations
– AddRoundKey, SubBytes, ShiftRows, MixColumns
• Some operations are linear and can swap
places with the op before them
• The key to understanding the attack is to sniff the
first round and extract the key
– Later I found out that Eloi had made my life harder
rijndael => evolves into => whitebox(rijndael)
whitebox(rijndael)
• 1st transformation:
– ShiftRows is linear, and thus can swap op
position with AddRoundKey (with the round key shifted the same way)
– SubBytes and ShiftRows can swap op
position, as SubBytes applies the same op to every byte
• Let “Linear” aka lin be
– lin(x) ^ lin(y) == lin(x ^ y)
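The linearity property can be checked directly for ShiftRows. A small sketch, assuming the usual column-major AES state layout (byte index = row + 4 * column); the helper names are illustrative.

```python
import os

def shift_rows(state):
    """AES ShiftRows on a 16-byte column-major state: row r rotates left by r."""
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]

def xor_bytes(a, b):
    return [x ^ y for x, y in zip(a, b)]

x = list(os.urandom(16))  # random state
k = list(os.urandom(16))  # random round key

# lin(x) ^ lin(k) == lin(x ^ k), so AddRoundKey and ShiftRows can swap
# places as long as the round key goes through ShiftRows as well.
lhs = shift_rows(xor_bytes(x, k))
rhs = xor_bytes(shift_rows(x), shift_rows(k))
```

Both sides are equal for every choice of x and k, since ShiftRows only moves bytes around.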
• 2nd transformation
– It is possible to transform and “compress” several ops into one
• By using XORtables and T/yboxes
– T/yibox
• Combine AddRoundKey and SubBytes into one operation
(lookup table) to emit 1 byte
• SubBytes(x ^ k[i])
• XORtable
– Transform MixColumns into a series of lookup tables;
in particular, each table gives the contribution of one input byte
to the MixColumns output, and the contributions are XORed together
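The table constructions above, sketched without any encodings. The S-box is generated rather than hard-coded; the key byte 0x2B, the helper names and the test column are illustrative choices. The last function shows why a "naked" T-box gives its key byte away, which is the first-round sniffing idea mentioned earlier.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) with the AES polynomial 0x11B."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return p

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def make_sbox():
    """AES S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for a in range(256):
        inv = next((b for b in range(1, 256) if gmul(a, b) == 1), 0)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = make_sbox()

def make_tbox(key_byte):
    """T-box: AddRoundKey and SubBytes merged, T[x] = SubBytes(x ^ k)."""
    return [SBOX[x ^ key_byte] for x in range(256)]

# MixColumns matrix; TY[j][x] is the contribution of input byte j with value x.
MC = [(2, 3, 1, 1), (1, 2, 3, 1), (1, 1, 2, 3), (3, 1, 1, 2)]
TY = [[tuple(gmul(MC[i][j], x) for i in range(4)) for x in range(256)]
      for j in range(4)]

def mix_column(col):
    """MixColumns on one column as four table lookups XORed together."""
    out = [0, 0, 0, 0]
    for j, byte in enumerate(col):
        out = [o ^ t for o, t in zip(out, TY[j][byte])]
    return out

def recover_key_byte(tbox):
    """For an unencoded T-box, T[0] = SubBytes(k), so invert the S-box."""
    return SBOX.index(tbox[0])

tbox = make_tbox(0x2B)
recovered = recover_key_byte(tbox)
mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

Once external encodings are applied, tbox[0] no longer equals SubBytes(k), which is why the encodings have to be stripped first.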
• 3rd transformation
– Append external encoding into the keys and lookup tables
– Replace table values with random ones upon stage
– 41 => 32, 21 => 56, 12 => 4
– Let G & F be encoding values
– G() o AES() o F()
• Such that G & F cancel each other out eventually
– The external encoding is what makes the whitebox variant
“attack resistant”
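How the random encodings cancel can be shown with two stand-in tables. A sketch under assumptions: t1 and t2 are arbitrary byte tables rather than real AES tables, f, g and h are random byte bijections drawn with a fixed seed, and all names are hypothetical.

```python
import random

rng = random.Random(1234)  # fixed seed so the sketch is repeatable

def random_bijection():
    """Return a random byte permutation and its inverse."""
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, v in enumerate(perm):
        inv[v] = i
    return perm, inv

# Stand-in lookup tables (not AES), applied one after the other.
t1 = [(x * 7 + 3) & 0xFF for x in range(256)]
t2 = [x ^ 0x5A for x in range(256)]

f, f_inv = random_bijection()  # input encoding
g, g_inv = random_bijection()  # encoding between the two tables
h, h_inv = random_bijection()  # output encoding

# Each stored table undoes the previous encoding and applies the next one.
enc_t1 = [g[t1[f_inv[x]]] for x in range(256)]
enc_t2 = [h[t2[g_inv[x]]] for x in range(256)]

# The inner encoding g cancels; only f and h remain on the outside.
matches = all(h_inv[enc_t2[enc_t1[f[x]]]] == t2[t1[x]] for x in range(256))
```

Looking at enc_t1 or enc_t2 alone shows only scrambled values, yet the chain still computes t2(t1(x)) once f and h are known.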
Attaq
Attaq 101
• Chow stated that his implementation doesn’t
leak any information
– In reality the XORtables and T/ytables still leak
one nibble each time
– Not very helpful but still something
• Since the external encodings cancel each other out, it
might be worth understanding them
– Hint hint
Attaq!
• If we look at input encoding and output
encoding we know that they both cancel each
other out
• Thus if we manage to find the values of the
encoding we’d only have a “naked”
implementation of wbaes
• And then just sniff the first round key and
extract the key
Cryptbox
• Let’s try to look at MixColumns in the
Ty/itables transformations
• In general, it transforms
32b values to 32b values
• Let P be input encoding
and Q output encoding
• Now let’s try to approximate
the encoding values
• Billet suggests to zero out two bits out of the 4
and build up a new lookup table and perform
the transformation
• Once we have that, we
construct a new lookup table
for the reversed operation
whitebox^whitebox
• We get 256 possible bijections
which can be used to build up
output encoding approximations
• The same operation is done to the input encoding, using
the acquired approximation we had for Q
• Once we have the external encoding values we can
just sniff the first round key and extract the keys
FIN
• @shiftreduce
• shiftreduce@gmail.com
• Thanks to Eloi for making this challenge
• greetz @
#ecl,#nullsec,inbarr,nirizr,skier_,emdel,over,
Mikae, l_inc,