Kernel Exploits

Windows Kernel Vulnerability Research and Exploitation
Gilad Bakas
Presentation Overview
• Why Kernel?
• What’s Different?
• Technical Background
• Vulnerability Research
• Common and less common kernel bugs
• Exploit Development
• Examples: Use-after-free, DRM
• Tips & Tricks
• AFD.SYS: A simple kernel bug
• Win32k.sys: A complex kernel bug
• Windows 8 and the future
• Questions
Why Kernel?
• Kernel exploitation used to be much harder than
User Mode exploitation
• With the introduction of DEP, ASLR, UAC, heap
checks, Protected Mode, sandboxes, etc. in User
Mode, it’s now on par and sometimes even
easier
• In parallel to the securing of User Mode, a lot of
OS functionality was moved from User to Kernel,
and new User-to-Kernel interfaces were
introduced, thus drastically increasing the attack
surface in the Kernel
Why Kernel Cont’d
• In 64bit systems, the Driver Signing
Requirements prevent even an
Administrator from running unsigned
Kernel Code, making exploitation the only
alternative.
• Many times, the payload uses a driver
anyway, so it’s easier to just start from the
Kernel
This is already happening
• Quoted from a November 4 article by Gregg
Keizer on ComputerWorld:
“Microsoft has been extremely busy patching
pieces of the Windows kernel this year.
So far during 2011, Microsoft has patched 56
different kernel vulnerabilities with updates
issued in February, April, June, July, August and
October. In April alone, the company fixed 30
bugs, then quashed 15 more in July.”
What’s different?
• If something goes wrong, it goes REALLY
wrong. That means that even the smallest glitch
leads to a BSOD and a system reboot.
• No need to worry about permissions
• You have to master a lot more technical
knowledge.
• No process boundaries. This means that you
have a lot more to play with, but also a lot more
to mess with
Required Technical Background
Things you have to master before you even begin
• Kernel APIs
• Memory Layout
• Interrupts, IRQLs, DPCs, IRPs
• Synchronization: Events, Spinlocks, Mutexes,
Timers, Semaphores, Resources
• Paging mechanism
• Intel System Architecture
• Device Driver structure, MJ functions, IOCTLs
Vulnerability Research
• High-Level first
• Look for complexity
• Real challenge is figuring out where NOT to look
• Interfaces where different teams have to
cooperate are more vulnerable – e.g. interaction
between User and Kernel
• Privilege escalations are much easier than
remote exploits
• Multiple weak exploits can form one strong
attack
Vulnerability Research – Cont’d
• Three approaches for finding vulnerabilities:
1. The High Level approach
“Let’s first understand how this whole system works, and only
then look for the holes.”
2. The Low Level approach
“This function looks complex, let’s break it down to the bit and
see if it has any bugs”
3. The blackbox / brute force / fuzzing approach
“Let’s make this mothafucka crash by trying every possible
input. We’ll worry about the details later”
Vulnerability Research – Cont’d
High Level approach process
– Read everything you can possibly find about your target: white
papers, help documents, bug reports, users forums etc.
– Ask yourself “If I had to develop this code, how would I do it?”
– Research to find all the possible ways to develop that
functionality
– Now that you know how it can be done, go to the code and
find out which method it uses.
You now have a high level overview of how your target works!
– With the knowledge of how it works, think of possible
weaknesses, then search for them in the code
Vulnerability Research – Cont’d
Low Level approach process
– Identify logically and technically complex
functions / operations in the code
– Completely analyze and/or reverse engineer
the relevant functions, looking for bugs
– If a bug is found, figure out if there’s a way
to generate an attack flow that will trigger it
Vulnerability Research – Cont’d
Blackbox / brute force / fuzzing approach process
1. Identify all the possible code inputs that are under your control
2. Find out the structure of the input fields, including calculated
data like CRCs, lengths, etc.
3. Test different inputs by:
– Manually thinking of inputs that are likely to be mishandled
– Writing a script/program to generate inputs for you
– Using a fuzzing script / program / infrastructure
4. The goal is to get the best code coverage
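Step 2 matters because a target usually rejects an input whose length or checksum field is wrong long before the interesting code runs. A minimal sketch of a generator that mutates the payload and then recomputes the calculated fields might look like this. The record layout (2-byte length, payload, CRC-32) and the function names are assumptions made up for the illustration, not a format used by any particular driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit. */
uint32_t crc32_calc(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical record layout: [len: 2 bytes LE][payload][crc32: 4 bytes LE].
 * The CRC covers the length field and the payload.
 * Returns the total record size, or 0 if it does not fit in `out`. */
size_t build_record(const uint8_t *payload, uint16_t len,
                    uint8_t *out, size_t out_cap) {
    size_t total = 2u + (size_t)len + 4u;
    assert(out != NULL);
    if (total > out_cap) return 0;
    out[0] = (uint8_t)(len & 0xFF);
    out[1] = (uint8_t)(len >> 8);
    if (len) memcpy(out + 2, payload, len);
    uint32_t crc = crc32_calc(out, 2u + (size_t)len);
    for (int i = 0; i < 4; i++)
        out[2 + len + i] = (uint8_t)(crc >> (8 * i));
    return total;
}

/* Flip one payload bit, then rebuild so the length and CRC stay valid. */
size_t mutate_record(uint8_t *payload, uint16_t len, size_t bit_index,
                     uint8_t *out, size_t out_cap) {
    if (len == 0) return 0;
    bit_index %= (size_t)len * 8u;
    payload[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
    return build_record(payload, len, out, out_cap);
}
```

Calling mutate_record in a loop with different bit_index values yields inputs that all pass the format checks, so each one spends its time in the parsing logic rather than in the early rejection path.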
Common Bugs
• Buffer overflows (stack and pool)
• NULL dereference
• Faulty input validation
Less common bugs
• Use-after-free
• Direct calls into User code
• Logical bugs
Exploit Development
• The more knowledge you have the better:
– Constant memory addresses
– Memory layouts
– Heaps, Pools, Stacks
– APIs, Objects
– CPU
– Assembly
• Creativity
• In kernel mode there are no process boundaries
– we can use everything
Example 1: Use-after-free
• The bug: object is freed but still kept in a linked list of
active objects
• To exploit we needed to get our own data into the freed
buffer *before someone else does*
• The solution:
– Use a bug in one driver to cause CPU starvation, thus reducing
the chance of anyone else stealing our buffer
– Use some DPC code in a second driver that allocates a buffer
with the right size and copies our data into it
– Activate the code in our target driver that uses the freed object,
causing our shellcode to be run
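The core of the bug (a freed object that stays reachable through a list) can be modelled safely in user mode. The sketch below is a toy, not the driver code from the example: the fixed-slot allocator stands in for the kernel pool, and every name in it is made up for the illustration. It only shows why whoever allocates next ends up controlling what the stale list entry reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 4

typedef struct { int handler_id; } toy_object;

/* Toy fixed-size allocator standing in for a kernel pool (an assumption).
 * The union lets a slot be viewed as an object or as raw bytes. */
typedef union { toy_object obj; unsigned char raw[16]; } toy_slot;
static toy_slot slots[SLOT_COUNT];
static int slot_used[SLOT_COUNT];

void *toy_alloc(void) {
    for (int i = 0; i < SLOT_COUNT; i++)
        if (!slot_used[i]) { slot_used[i] = 1; return &slots[i]; }
    return NULL;
}

void toy_free(void *p) {
    for (int i = 0; i < SLOT_COUNT; i++)
        if (p == (void *)&slots[i]) slot_used[i] = 0;
}

/* The "active list" that keeps a pointer after the object is freed. */
static toy_object *active_list[SLOT_COUNT];

/* Returns the value the stale list entry yields after the slot is reused. */
int demo_stale_reference(void) {
    toy_object *obj = toy_alloc();
    assert(obj != NULL);
    obj->handler_id = 1;
    active_list[0] = obj;

    toy_free(obj);                 /* bug: active_list[0] is not cleared */

    /* The next allocation gets the same slot and fills it with its own bytes. */
    int other = 0x41414141;
    void *reused = toy_alloc();
    assert(reused == (void *)obj);
    memcpy(reused, &other, sizeof other);

    return active_list[0]->handler_id;   /* reads the new owner's data */
}
```

In the toy, the slot is reused immediately because nothing else allocates in between. In a real pool that is a race, which is why the example above goes to the trouble of starving other CPUs first.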
Example 2: DRM
• This isn’t actually a kernel exploit, but it’s a great
example of:
– An insecure interface between User and Kernel code
– How the interface points between different
development teams are likely to be the weakest links
• The system in this example is a DRM system
that was meant to prevent movies from being
copied by allowing playback on one machine
only
Example 2: DRM Cont’d
• Every movie is encrypted
• Decryption code is embedded in
the movie
• The code is different in each movie
• License is given per-computer,
based on hardware signature
• Accessing the hardware requires
Kernel code, so the decryption
code inside the movie calls a small
driver, that calls straight back into
the user code
[Diagram: Ring0 (Kernel-Mode): PCCP Device Driver. Ring3 (User-Mode): Movie, containing the Movie Decryption Code.]
Example 2: DRM Broken
• Hook DeviceIoControl
• Instead of calling the driver, we call the
user-mode code within a try…catch
statement
• Any attempt to do something nasty like
reading the BIOS and accessing
hardware will generate an exception
• We handle the exceptions by reading the
data from a file instead of the
BIOS/hardware
• The decryption code is tricked to “see” the
same hardware on all computers, so we
can use the same license everywhere!
[Diagram: Ring0 (Kernel-Mode): PCCP Device Driver. Ring3 (User-Mode): Injected Code, and the Movie containing the Movie Decryption Code.]
Tips & Tricks
• Arbitrary write – good places to overwrite:
Generic places with fixed addresses (per OS
build):
– Callback functions pointers
– Data segment variables
– GDT/LDT tables
(http://j00ru.vexillium.org/?p=290)
– Dispatch Table (used to be great, but it’s
blocked on new OSs)
Tips & Tricks – Cont’d
Non-fixed addresses that can be extracted:
• New technique (thanks to Gil Dabah and Tarjei Mandt):
The Window Handle Table is mapped to user address
space and contains Kernel pointers to objects with
function pointers in them (see
http://www.mista.nu/research/mandt-win32k-paper.pdf)
• So it’s possible to:
1. Create a kernel window (a window for which win32k created
and registered a window class so the window procedure is in
kernel, such as menu and tooltip)
2. Get the pointer to the kernel window object from the Handle
Table
3. Overwrite the WndProc Pointer
4. Send a message to the window to trigger the WndProc Pointer
Tips & Tricks – Cont’d
Non-fixed addresses that can be extracted:
– Other Kernel pointers that are passed to user
space.
– Some Win32k.sys syscalls are defined as
VOID or USHORT and leak full or partial
kernel pointers in the return value (see
http://j00ru.vexillium.org/?p=762)
Tips & Tricks – Cont’d
• A BSOD is not the end:
– There’s plenty of code that runs AFTER an exception,
and many times that code calls callback functions that
can be overwritten. Especially with
ACCESS_VIOLATIONs, the flow goes to the page-fault
handler first, so there are plenty of options for
attack
– Even inside KeBugCheck there are callbacks that can
be overwritten
– It’s a bit tricky to fix the context and resume normal
execution afterwards – but it can be done.
Tips & Tricks – Cont’d
• WOW64 processes:
– When running in a 32bit process on a 64bit system,
when you try to call NtQuerySystemInformation, all
the returned pointers are truncated to 32bit – Very
annoying!
– This can be overcome by using the built-in call gate to
temporarily switch to 64bit, call
NtQuerySystemInformation, then return to your 32bit
code. For more information see
http://vxheavens.com/lib/vrg02.html (and thanks to
Mark Dowd for the tip!)
– The 64bit TEB can be accessed directly without all
the switching to 64bit mess since it’s mapped at gs:0
From Kernel to User
• Many times, kernel exploits are required to
install user-mode payloads or perform
operations that require running user-mode code
• Contrary to common logic that says the more
power the better, launching user-mode code that
runs with SYSTEM privileges from the Kernel
can be very tricky (due to the lack of API and OS
support to do so).
• In the following slides we’ll go over the different
techniques that can be used, and the pros and
cons of each of them
From Kernel to User – Changing
the process token
Method
Change the token of a process we already have control
of (e.g. the process that launched the exploit) to a
SYSTEM token
Pros
– Easiest way to implement
– Very reliable
Cons
– More noisy
– The user-mode code has to do all the nasty work (e.g. injecting
code to a system process), making it vulnerable to AV and
security programs that hook user APIs
From Kernel to User – User-mode
APCs
Method
Queue a user-mode APC to a target thread already running in a
system process.
Pros
– Gets you directly to where you want to be
– Allows injection to any process on the system
Cons
– Only threads in Alertable state can be targeted, and there is no generic
way to find them. An alternative is to force a thread into an Alertable
state, but this breaks its waiting state, causing the wait function to return
mid-way, and may lead to system instability or crash.
– Very undocumented, and the relevant structures are different between
OS versions.
– Unless targeting a thread you have intimate knowledge of, this method
may lead to deadlocks if the thread is holding some locks when it enters
the wait state (e.g. the LoaderLock)
From Kernel to User – Thread
Hijacking
Method
Change the context of an existing thread in a system
process to execute injected code.
Pros
– Gets you directly to where you want to be
– Allows injection to any process on the system
Cons
– Restoring the context can be very difficult.
– Hijacking an arbitrary thread is extremely dangerous and may
cause deadlocks, instability, or crashes
From Kernel to User – Creating a
new thread
Method
Create a new user-mode thread in a system process
Pros
– An almost perfect solution, gets you exactly to where you want
without any dangers or context issues.
– Allows injection to any process on the system
Cons
– Extremely difficult to implement. In order for the new thread to
function it has to be registered with CSRSS. The APIs and
structures involved with that are complex, undocumented, and
change constantly with Windows updates.
From Kernel to User – API hooking
Method
Hook a user-mode API that you know is going to be
called or that you can cause to be called within a system
process
Pros
– Allows injecting directly into a system process
– Very reliable
Cons
– Finding a suitable API to hook may be difficult.
– This method isn’t generic, and will only work on system
processes that frequently call the targeted API.
An example of a simple Kernel
Exploit – AFD.SYS
• Let’s have a look at
afd!AfdGetRemoteAddress
Can someone see the problem?
// Attacker controls OutputBuffer and OutputBufferLength
void IOCTL_handler(...) {
    [...]
    try {
        // Validates only OutputBufferLength bytes at OutputBuffer
        ProbeForWrite(OutputBuffer,
                      OutputBufferLength,
                      sizeof(UCHAR));
        // Copies RemoteSocketAddressLength bytes, whatever OutputBufferLength was
        RtlCopyMemory(
            OutputBuffer,
            (PUCHAR)context + endpoint->Common.VcConnecting.RemoteSocketAddressOffset,
            endpoint->Common.VcConnecting.RemoteSocketAddressLength
        );
    } except( AFD_EXCEPTION_FILTER(&status) ) {
        [...]
    }
}
Hint: ProbeForWrite doesn’t throw an exception when length == 0, regardless of
the actual pointer
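The mismatch is easier to see in a small user-mode model. The sketch below is not the AFD code or the real ProbeForWrite: the boundary constant, the function names, and the "fixed" variant are assumptions for illustration. It only reproduces the behavior the hint describes (a zero length passes without looking at the address) and shows one way a handler could tie the probe to the number of bytes it actually writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user/kernel boundary, used by this model only. */
#define MODEL_USER_LIMIT 0x7FFF0000u

/* Models the behavior from the hint: a zero length is accepted without
 * looking at the address. Returns 1 if the probe passes, 0 if it would raise. */
int model_probe_for_write(uint32_t address, uint32_t length) {
    if (length == 0) return 1;                 /* no check at all */
    if (address + length < address) return 0;  /* range wraps around */
    return address + length <= MODEL_USER_LIMIT;
}

/* Vulnerable pattern: probe with the caller's length, copy with another. */
int vulnerable_accepts(uint32_t out_addr, uint32_t out_len, uint32_t copy_len) {
    (void)copy_len;                            /* never compared to out_len */
    return model_probe_for_write(out_addr, out_len);
}

/* One possible fix: reject short buffers, then probe the bytes to be written. */
int fixed_accepts(uint32_t out_addr, uint32_t out_len, uint32_t copy_len) {
    if (out_len < copy_len) return 0;
    return model_probe_for_write(out_addr, copy_len);
}
```

With a kernel-range address and a length of 0, vulnerable_accepts returns 1 while fixed_accepts returns 0, which is the whole bug in one comparison.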
AFD.SYS - Continued
• OK, so we can write data to any address
we want, including kernel addresses, but
we can’t really control what data!
• The data written looks like this:
02 00 XX XX YY YY YY YY, where XXXX
is the port and YYYYYYYY is the IP, and
there has to be an active TCP connection
for the function to work
• What to do?
AFD.SYS - Continued
• Our options:
– Overwriting a flag
– Maybe we don’t need full control of the data?
AFD.SYS - Continued
• Solution:
– We can connect to 127.0.0.1, that’s 7F 00 00 01.
– Port 445 is always open on Windows machines, that’s
01 BD, so now we have 01 BD 7F 00 00 01
– We want to overwrite a 32bit pointer, and we need an
address that we can easily allocate
– How about the four bytes in the middle: 01 [BD 7F 00 00] 01? Intel is
little-endian, so as a 32bit value that gives us 0x00007FBD. Perfect!
– Now we just need a pointer to overwrite. Since this is
an old bug that only works on XP, we can just use the
Dispatch Table.
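The byte arithmetic can be checked with a few lines of C. This is only a sketch of the calculation: the helper names are made up, and the buffer models nothing more than the 8-byte record (02 00, port, IP) being laid down a given number of bytes before a 32-bit slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Lay the 8-byte record down `lead` bytes before a 32-bit slot and return
 * the value the slot ends up holding. `memory` models the bytes starting
 * at (slot - lead). */
uint32_t slot_value_after_write(uint8_t port_hi, uint8_t port_lo,
                                const uint8_t ip[4], size_t lead) {
    uint8_t memory[16] = {0};
    uint8_t record[8] = {0x02, 0x00, port_hi, port_lo,
                         ip[0], ip[1], ip[2], ip[3]};
    assert(lead + 4 <= sizeof memory);
    memcpy(memory, record, sizeof record);
    return read_le32(memory + lead);
}
```

For port 445 (01 BD) and 127.0.0.1 (7F 00 00 01), a lead of 3 bytes puts BD 7F 00 00 in the slot, which reads back as 0x00007FBD. The trailing 01 lands in the byte just after the slot.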
AFD.SYS - Exploit
1. Allocate page at 0x7fb0 and copy the
shellcode into it.
2. HookAddress = Dispatch table entry for
ZwQueryIntervalProfile
3. connect() to 127.0.0.1:445
4. DeviceIoControl((HANDLE)sock, 0x1203F,
NULL, 0, (PVOID)(HookAddress - 3), 0,
&Result, NULL)
5. ZwQueryIntervalProfile()
AFD.SYS - Demo
Demo
Walk-through of a complex Kernel
PE Exploit
• Thanks to my friend Gil Dabah (creator of
diStorm Disassembler)
• This bug was silently fixed by MS in
February
Background
• When registering a Window Class it’s possible to
request the OS to store some extra bytes with the
window object
• The extra bytes are appended to the WND
structure in the kernel:
[Diagram: WND Struct | Extra Bytes]
Background - Continued
• Some special window types (Menus, Tooltips, etc)
also have some private data that can only be
accessed by the kernel:
[Diagram: WND Struct | Private Data | Extra Bytes]
Background - Continued
• To change the data on the extra bytes, applications
call the SetWindowLongPtr function with the index
into the extra bytes and a new value.
• The function then checks if the index provided is
within the private data or the user extra bytes. If
the index is within the private bytes, the function
fails, so normally it’s impossible to change the
private kernel data.
[Diagram: WND Struct | Private Data | Extra Bytes]
Background - Continued
• In order to check if the index is within the private data,
SetWindowLongPtr uses a table of window types with their
corresponding total allocated bytes size (WND struct + private).
• “Window type” refers to FNID, which is the real identifier of a window
type, from a list of hard-coded values (unlike its Class).
• The pseudo code for the check is:
if (index < (int)(window_class_alloc_sizes[fnid] - sizeof(WND))) FAIL;
Window Type (FNID)    Allocated bytes
Menu                  0xa4 (WND size) + 4 (private bytes) == 0xa8
Tooltip               0xa4 (WND size) + 4 (private bytes) == 0xa8
...                   ...
The bug – part 1
• By using the undocumented and unexported function
RegisterClassExWOWW and supplying an internal
window type and a negative number for the extra bytes,
it’s possible to overwrite the table with our own value.
The bug is that the extra bytes value isn’t verified:
Window Type (FNID)    Allocated bytes
Menu                  0xa4 (WND size) + (-0xa8) (extra bytes) == -4
Tooltip               0xa4 (WND size) + 4 (private bytes) == 0xa8
...                   ...
The bug – part 1 - continued
• With the table altered to have a negative number
as the # of allocated bytes, the test code is tricked:
(index < (int)(window_class_alloc_sizes[fnid] - sizeof(WND))) == always
FALSE
• We can now call SetWindowLongPtr with 0 as
index and change the private kernel data for the
window
[Diagram: WND Struct | Private Data | Extra Bytes]
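The effect of the negative table entry on the check can be reproduced with plain integers. The sketch below is a model, not win32k code: WND_SIZE is the 0xa4 quoted in the table, the function names are made up, and the stricter variant is only one conceivable hardening, not a description of the actual fix.

```c
#include <assert.h>

#define WND_SIZE 0xa4   /* WND structure size quoted in the table */

/* Model of the index check: returns 1 when the write is allowed,
 * 0 when the index falls inside the private data and the call fails.
 * Negative indexes select built-in fields and are outside this model. */
int index_allowed(int index, int alloc_size) {
    assert(index >= 0);
    int private_bytes = alloc_size - WND_SIZE;  /* goes negative if the entry does */
    return !(index < private_bytes);
}

/* A stricter variant (an assumption, not the actual fix): refuse any table
 * entry that is smaller than the WND structure itself. */
int index_allowed_strict(int index, int alloc_size) {
    assert(index >= 0);
    if (alloc_size < WND_SIZE) return 0;
    return index >= alloc_size - WND_SIZE;
}
```

With the original entry (0xa8), index 0 is rejected because 0 < 4. With the altered entry (-4), the threshold becomes -0xa8, so no non-negative index is ever below it and index 0 is accepted.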
The bug – part 2
• Now that we can overwrite private kernel data, we
need to find a window type that has some useful
stuff stored there.
• The Menu window type stores a pointer to a
structure, and during window destruction, a pointer
in that structure is NULLed, giving us the ability to
NULL any 32/64 bit value in the system – Bingo!
Exploitation
• Since the Menu window
private structure changes
between Windows versions,
we run the exploit twice:
– The first time overwriting the
pointer to the structure with a
pointer to some non-NULL
array, so that we can find out
the offset where the NULL is
written
– The second time with a
pointer to the address we
want to NULL minus the offset
found in the first stage
[Diagram: WND Struct | Private Data | Extra Bytes. 1st time: the private pointer targets an array of bogus data, and the position of the NULL in it gives the NULL offset. 2nd time: the private pointer targets real data, placed so that the NULL lands on the address to overwrite.]
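The two-pass idea is independent of the window internals and can be sketched in user mode. Everything below is a model: model_destroy stands in for the destruction path, the hidden offset of 0x40 is invented for the example, and the function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROBE_SIZE 256

/* Stand-in for the destruction path: clears one pointer-sized field at an
 * offset the caller does not know. The value 0x40 is made up for the model. */
static const size_t hidden_offset = 0x40;

void model_destroy(uint8_t *structure) {
    memset(structure + hidden_offset, 0, sizeof(void *));
}

/* Pass 1: hand in a buffer with no zero bytes and see which ones get cleared.
 * Returns the offset of the first cleared byte, or PROBE_SIZE if none. */
size_t find_cleared_offset(void) {
    uint8_t probe[PROBE_SIZE];
    memset(probe, 0xFF, sizeof probe);
    model_destroy(probe);
    for (size_t i = 0; i < sizeof probe; i++)
        if (probe[i] == 0) return i;
    return PROBE_SIZE;
}

/* Pass 2: aim the same operation at a chosen field by subtracting the offset. */
void clear_field(uint8_t *region, size_t field_pos, size_t offset) {
    assert(field_pos >= offset);
    model_destroy(region + field_pos - offset);
}
```

Filling the first-pass buffer with a non-zero pattern is what makes the cleared bytes stand out, which is why the slide asks for a non-NULL array.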
Exploitation - continued
• All that’s left now is to allocate our shellcode at
page 0, overwrite a function pointer, and then get it
called
• Easy!
Exploitation - flow
1. Find the address of RegisterClassExWOWW using
diStorm
2. RegisterClassExWOWW() passing the FNID for a Menu
and a WNDCLASSEX structure with a negative number
for the extra bytes
3. CreateWindow()
4. SetWindowLongPtr() with a non-NULL array
5. DestroyWindow()
6. Find offset
7. Repeat steps 3-5, this time passing the actual address
to overwrite minus the offset
8. Get the overwritten pointer to be called
Windows 8 and the future
• Null-dereference is blocked – first 64k can’t be
allocated
• New integrity checks to the kernel pool memory
allocator (see
http://blogs.msdn.com/b/b8/archive/2011/09/15/protecting-you-from-malware.aspx)
• Improved Linked-Lists security to protect against
corrupted/dangling list pointers (see
http://www.alex-ionescu.com/?p=69)
Questions?
gbakas@gmail.com