Introduce LLVM from a hacker's view. Loda chou. hlchou@mail2000.com.tw For HITCON 2012 Who am I? I am Loda. Work for 豬屎屋 (DeSign House). Be familiar for MS-Windows System and Android/Linux Kernel. Sometimes…also do some software crack job. Like to dig-in new technology and share technical articles to promote to the public. Motto The way of a fool seems right to him ,but a wise man listens to advice. (Proverbs 12:15) 愚妄人所行的,在自己眼中看為正直;惟智慧人肯聽人的勸教 What is LLVM? Created by Vikram Adve and Chris Lattne on 2000 Support different front-end compilers (gcc/clang/....) and different languages (C/C++,Object-C,Fortran,Java ByteCode,Python,ActionScript) to generate BitCode. The core of LLVM is the intermediate representation (IR). Different front-ends would compile source code to SSA-based IR, and traslate the IR into different native code on different platform. Provide RISC-like instructions (load/store…etc), unlimited registers, exception (setjmp/longjmp)..etc Provide LLVM Interpreter and LLVM Compiler to run LLVM application. Let's enjoy it. Android Dalvik RunTime Dalvik ByteCode AP in dex/odex Dalvik Virtual Machine Dalvik ByteCode Framework in JAR Java Native Interface Partial Dalvik AP implemented in Native .so Native .so library Linux Kernel The features of Dalvik VM Per-Process per-VM JDK will compile Java to Sun’s bytecode, Android would use dx to convert Java bytecode to Dalvik bytecode. Support Portable Interpreter (in C), Fast Interpreter (in Assembly) and Just-In Time Compiler Just-In-Time Compiler is Trace-Run based. By Counter to find the hot-zone Would translate Dalvik bytecode to ARMv32/NEON/Thumb/Thumb2/..etc CPU instructions. LLVM Interpreter RunTime LLVM BitCode AP Running by LLI (Low Level Virtual Machine Interpreter & Dynamic Compiler) Native .so library Linux Kernel Why LLVM? Could run llvm-application as the performance of native application Could generate small size BitCode, translate to target platform assembly code then compiled into native execution file (final size would be almost the same as you compile it directly from source by GCC or other compiler.) Support C/C++/… program to seamlessly execute on variable hardware platform. x86, ARM, MIPS,PowerPC,Sparc,XCore,Alpha…etc Google would apply it into Android and Browser (Native Client) The LLVM Compiler Work-Flows. llc -mcpu=x86-64 C/C++ Assembly clang -emit-llvm llc -mcpu=cortex-a9 Java Fortran BitCode Assembly X86 LLVM Compiler gcc X86 Execution File arm-none-linux-gnueabi-gcc -mcpu=cortex-a9 ARM ARM Assembly Execution File ...etc LLVM in Mobile Device C/C++ BitCode Java ByteCode BitCode Render Script BitCode Application LLVM in Browser Chromium Browser HTML/Java Script SRPC NPAPI Would passed the security checking before execution. IMC UnTrust Part Native Client APP Call to run-time framework Service Framework SRPC IMC Storage IMC : Inter-Module Communications Service SRPC : Simple RPC NPAPI : Netscape Plugin Application Programming Interface Trust Part LLVM Compiler Demo. Use clang to compile BitCode File. [root@localhost reference_code]# clang -O2 -emit-llvm sample.c -c -o sample.bc [root@localhost reference_code]# ls -l sample.bc -rw-r--r--. 1 root root 1956 May 12 10:28 sample.bc Convert BitCode File to x86-64 platform assembly code. [root@localhost reference_code]# llc -O2 -mcpu=x86-64 sample.bc -o sample.s Compiler the assembly code to x86-64 native execution file. [root@localhost reference_code]# gcc sample.s -o sample -ldl [root@localhost reference_code]# ls -l sample -rwxr-xr-x. 1 root root 8247 May 12 10:36 sample Convert BitCode File to ARM Cortext-A9 platorm assembly code. [root@localhost reference_code]# llc -O2 -march=arm -mcpu=cortex-a9 sample.bc -o sample.s Compiler the assembly code to ARM Cortext-A9 native execution file. [root@localhost reference_code]# arm-none-linux-gnueabi-gcc -mcpu=cortex-a9 sample.s -ldl -o sample [root@localhost reference_code]# ls -l sample -rwxr-xr-x. 1 root root 6877 May 12 10:54 sample What is the problems for LLVM? Let’s see a simple sample code. LLVM dlopen/dlsymc Sample. [root@www LLVM]# clang -O2 -emit-llvm dlopen.c -c -o dlopen.bc [root@www LLVM]# lli dlopen.bc libraryHandle:86f5e4c8h puts function pointer:85e81330h loda int (*puts_fp)(const char *); int main() { void * libraryHandle; libraryHandle = dlopen("libc.so.6", RTLD_NOW); printf("libraryHandle:%xh\n",(unsigned int)libraryHandle); puts_fp = dlsym(libraryHandle, "puts"); printf("puts function pointer:%xh\n",(unsigned int)puts_fp); puts_fp("loda"); return 0; } Make execution code as data buffer Would place the piece of machine code as a data buffer to verify the native/LLVM run-time behaviors. 0000000000000000 <AsmFunc>: 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: b8 04 00 00 00 mov $0x4,%eax 9: bb 01 00 00 00 mov $0x1,%ebx e: b9 00 00 00 00 mov $0x0,%ecx f: R_X86_64_32 gpHello 13: ba 10 00 00 00 mov $0x10,%edx 18: cd 80 int $0x80 1a: b8 11 00 00 00 mov $0x11,%eax 1f: c9 leaveq 20: c3 retq Native Program Run Code in Data Segment [root@www LLVM]# gcc self-modify.c -o self-modify [root@www LLVM]# ./self-modify Segmentation fault int (*f2)(); char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x4 0,0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3}; char gpHello[]="Hello Loda!ok!\n"; int main() { int vRet; unsigned long vpHello=(unsigned long)gpHello; TmpAsmCode[19]=vpHello>>24 & 0xff; TmpAsmCode[18]=vpHello>>16 & 0xff; TmpAsmCode[17]=vpHello>>8 & 0xff; TmpAsmCode[16]=vpHello & 0xff; f2=(int (*)())TmpAsmCode; vRet=f2(); printf("vRet=:%d\n",vRet); return 0; } Native Program Run Code in Data Segment with Page EXEC-settings [root@www LLVM]# gcc self-modify.c -o self-modify [root@www LLVM]# ./self-modify Hello Loda!ok! vRet=:17 int (*f2)(); char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x4 0,0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3}; char gpHello[]="Hello Loda!ok!\n"; int main() { int vRet; unsigned long vpHello=(unsigned long)gpHello; unsigned long page = (unsigned long) TmpAsmCode & ~( 4096 - 1 ); if(mprotect((char*) page,4096,PROT_READ | PROT_WRITE | PROT_EXEC )) perror( "mprotect failed" ); TmpAsmCode[19]=vpHello>>24 & 0xff; TmpAsmCode[18]=vpHello>>16 & 0xff; TmpAsmCode[17]=vpHello>>8 & 0xff; TmpAsmCode[16]=vpHello & 0xff; f2=(int (*)())TmpAsmCode; vRet=f2(); printf("vRet=:%d\n",vRet); return 0; } LLVM AP Run Code in Data Segment with EXEC-settings [root@www LLVM]# clang -O2 -emit-llvm llvm-self-modify.c -c -o llvm-self-modify.bc [root@www LLVM]# lli llvm-self-modify.bc Hello Loda!ok! int (*f2)(); vRet=:17 char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,0x0c,0x60,0x00,0xb a,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3}; char gpHello[]="Hello Loda!ok!\n"; int main() { int vRet; unsigned long vpHello=(unsigned long)gpHello; unsigned long page = (unsigned long) TmpAsmCode & ~( 4096 - 1 ); if(mprotect((char*) page,4096,PROT_READ | PROT_WRITE | PROT_EXEC )) perror( "mprotect failed" ); char *base_string=malloc(256); strcpy(base_string,gpHello); vpHello=(unsigned long)base_string; TmpAsmCode[19]=vpHello>>24 & 0xff; TmpAsmCode[18]=vpHello>>16 & 0xff; TmpAsmCode[17]=vpHello>>8 & 0xff; TmpAsmCode[16]=vpHello & 0xff; f2=(int (*)())TmpAsmCode; vRet=f2(); printf("vRet=:%d\n",vRet); return 0; } LLVM AP Run Code in Data Segment without EXEC-settings? [root@www LLVM]# clang -O2 -emit-llvm llvm-self-modify.c -c -o llvm-self-modify.bc [root@www LLVM]# lli llvm-self-modify.bc Hello Loda!ok! It still works! vRet=:17 int (*f2)(); char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,0x0c,0x60,0x00,0xb a,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3}; char gpHello[]="Hello Loda!ok!\n"; int main() { int vRet; unsigned long vpHello=(unsigned long)gpHello; char *base_string=malloc(256); strcpy(base_string,gpHello); vpHello=(unsigned long)base_string; TmpAsmCode[19]=vpHello>>24 & 0xff; TmpAsmCode[18]=vpHello>>16 & 0xff; TmpAsmCode[17]=vpHello>>8 & 0xff; TmpAsmCode[16]=vpHello & 0xff; f2=(int (*)())TmpAsmCode; vRet=f2(); printf("vRet=:%d\n",vRet); return 0; } So…..What we got? LLVM could run data-segment as execution code. LLVM doesn’t provide a strict sandbox to prevent the unexpected program flows. For installed-application, maybe it is ok. (could protect by Android Kernel-Level Application Sandbox) How about LLVM running in Web Browser? Code LLVM BitCode AP Running by LLI (Low Level Virtual Machine Interpreter & Dynamic Compiler) Bidirectional Function Call Data Technology always come from humanity!!! Native Client(Nacl) - a vision of the future Provide the browser to run web application in native code. Based on Google’s sandbox, it would just drop 5% performance compared to original native application. Could be available in Chrome Browser already. The Native Client SDK only support the C/C++ on x86 32/64 bits platform. Provide Pepper APIs (derived from Mozilla NPAPI). Pepper v2 added more APIs. Hack Google's Native Client and get $8,192 http://www.zdnet.com/blog/google/hackgoogles-native-client-and-get-8192/1295 Security of Native Client Data integrity Native Client's sandbox works by validating the untrusted code (the compiled Native Client module) before running it No support for process creation / subprocesses You can call pthread No support for raw TCP/UDP sockets (websockets for TCP and peer connect for UDP) No unsafe instructions inline assembly must be compatible with the Native Client validator (could use ncval utility to check) http://code.google.com/p/nativeclient/issues/list How Native Client Work? Browsing WebPage with Native Client. Chromium Browser Launch nacl64.exe to Execute the NaCl Executable (*.NEXE) file. Main Process and Dynamic Library C:\Users\loda\AppData\Local\Temp 6934.Tmp (=libc.so.3c8d1f2e) 6922.Tmp (=libdl.so.3c8d1f2e) 6933.tmp (=libgcc_s.so.1) 6912.tmp (=libpthread.so.3c8d1f2e) 67D8.tmp (=runnable-ld.so) 66AE.tmp (=hello_loda.nmf) 6901.Tmp (= hello_loda_x86_64.nexe) Chromium Browser lib64/libc.so.3c8d1f2e lib64/libdl.so.3c8d1f2e lib64/libgcc_s.so.1 lib64/libpthread.so.3c8d1f2e lib64/runnable-ld.so hello_loda.html hello_loda.nmf hello_loda_x86_32.nexe hello_loda_x86_64.nexe Download the main process and dynamic run-time libraries. Server provided Native Client Page Dynamic libraries Inheritance relationship Hello Loda Process (.NEXE) libpthread.so.3c8d1f2e libgcc_s.so.1 libdl.so.3c8d1f2e libc.so.3c8d1f2e runnable-ld.so =(ld-nacl-x86-64.so.1) Portable Native Client (PNaCl) PNaCl (pronounced "pinnacle") Based on LLVM to provided an ISA-neutral format for compiled NaCl modules supporting a wide variety of target platforms without recompilation from source. Support the x86-32, x86-64 and ARM instruction sets now. Still under the security and performance properties of Native Client. LLVM and PNaCl Refer from Google’s ‘PNaCl Portable Native Client Executables ’ document. PNaCl Shared Libraries libtest.c app.c Libtest.pso App.bc App.pexe pnacl-translate Translate to native code Libtest.so pnacl-translate App.nexe Execute under Native Client RunTime Environment http://www.chromium.org/nativeclient/pnacl/pnacl-shared-libraries-final-picture Before SFI Trust with Authentication Such as the ActiveX technology in Microsoft Windows, it would download the native web application plug-in the browser (MS Internet Explorer). User must authorize the application to run in browser. User-ID based Access Control Android Application Sandbox use Linux user-based protection to identify and isolate application resources. Each Android application runs as that user in a separate process, and cannot interact with each other under the limited access to the operating system.. General User/Kernel Space Protection Process individual memory space User Space RPC Application #1 UnTrust Code Trust Code Process individual memory space User Space RPC Application #2 Process individual memory space User Space Application #3 Application could use kernel provided services by System Call Kernel Space Device Drivers and Kernel Modules Fault Isolation CFI (CISC Fault Isolation) Based on x86 Code/Data Segment Register to reduce the overhead, NaCl CFI would increase around 2% overhead. SFI NaCl SFI would increase 5% overhead in Cortex A9 outof-order ARM Processor, and 7% overhead in x86_64 Processor. 1,ARM instruction length is fixed to 32-bits or 16bits (depend on ARMv32,Thumb or Thumb2 ISA) 2,X86 instruction length is variable from 1 to 1x bytes. CISC Fault Isolation Data/Code Dedicated Register= (Target Address & And-Mask Register) | Segment Identifier Dedicated Register Address SandBoxing Running Memory Space Address SandBoxing X86_64 Native Client running under specific 4GB memory 32bits 4GB Memory Space Software Fault Isolation Process individual memory space Running in Software Fault Isolation Model User Space SFI Trust Code UnTrust Code Trust Code Call Gate Return Gate User Space SFI UnTrust Code Application could use kernel provided services by System Call Kernel Space Device Drivers and Kernel Modules NaCl SFI SandBox NaCl would download the whole execution environment (with dynamic libraries) Would use x86_64 environment as the verification sample. Each x86_64 App would use 4GB memory space. But for ARM App, it would only use 0-1GB memory space. x86_64 R15 Registers would be defined as “Designated Register RZP” (Reserved Zero-address base Pointer),and initiate as a 4GB aligned base address to map the UnTrust Memory space. For the UnTrust Code, R15 Registers is readonly. RSP/RBP Register Operation The modification of 64bits RSP/RBP would be replaced by a set instructions to limit the 64bits RSP/RBP would be limited in allowed 32bits range. testB(0x1234); mov $0x1234,%edi callq 400624 <testB> int testB(int I) { int vRet; void *libraryHandle; … } //by x86_64 GCC 0000000000400624 <testB>: 400624: 55 push %rbp 400625: 48 89 e5 mov %rsp,%rbp 400628: 48 83 ec 20 sub $0x20,%rsp 40062c: 89 7d ec mov %edi,-0x14(%rbp) //by x86_64 pnacl-clang Function in 32bytes Alignment 00000000010003e0 <testB>: 10003e0: 55 push %rbp 10003e1: 48 89 e5 mov %rsp,%rbp Clear RSP high-32bits 10003e4: 83 ec 10 sub $0x10,%esp 10003e7: 4a 8d 24 3c lea (%rsp,%r15,1),%rsp Use R15 to limit RSP 10003eb: 89 7d fc mov %edi,-0x4(%rbp) in specific 4GB space Function Call (1/3) The function target address would be 32 bytes alignment, and limit the target address to allowed 32bits range by R15. char gpHello[]="Hello Loda!\n"; int (*f1)(const char *); …… f1 = dlsym(libraryHandle, "puts"); f1(gpHello); //by x86_64 GCC 40067c: e8 77 fe ff ff callq 4004f8 <dlsym@plt> //rax = function puts address 40068f: bf d0 0b 60 00 mov $0x600bd0,%edi 400694: ff d0 callq *%rax //by x86_64 pnacl-clang 100045b: e8 20 fd ff ff callq 1000180 <dlsym@plt+0x120> //rax = function puts address 1000466: bf 00 02 01 11 mov $0x11010200,%edi ……. Clear RAX high-32bits 1000478: 83 e0 e0 and $0xffffffe0,%eax 100047b: 4c 01 f8 add %r15,%rax Use R15 to limit RAX 100047e: ff d0 callq *%rax in specific 4GB space Function Call (2/3) Call function from ELF PLT (Procedure Linkage Table) The function target address would be 32 bytes alignment, and limit the target address to allowed 32bits range by R15. callq 4004f8 <dlsym@plt> int (*f1)(const char *); …… f1 = dlsym(libraryHandle, "puts"); //by x86_64 GCC 4004f8: ff 25 ba 06 20 00 jmpq *0x2006ba(%rip) # 600bb8 <_GLOBAL_OFFSET_TABLE_+0x38> 4004fe: 68 04 00 00 00 pushq $0x4 400503: e9 a0 ff ff ff jmpq 4004a8 <_init+0x18> callq 1000180 <dlsym@plt+0x120> //by x86_64 pnacl-clang 1000080: 4c 8b 1d 49 01 01 10 mov 0x10010149(%rip),%r11 #110101d0<_GLOBAL_OFFSET_TABLE_+0x20> R11 in 32bytes Alignment 1000087: 45 89 db mov %r11d,%r11d 100008a: 41 83 e3 e0 and $0xffffffe0,%r11d 100008e: 4d 01 fb add %r15,%r11 Use R15 to limit R11 1000091: 41 ff e3 jmpq *%r11 in specific 4GB space Function Call (3/3) For the internal UnTrust function directly calling, it doesn’t need to filter by the R15 vRet=0x1234*testA(0x888); //by x86_64 GCC 400696: bf 88 08 00 00 40069b: e8 54 ff ff ff Directly call to 69 c0 34 12 00 00 UnTrust function. 4006a0: //by x86_64 pnacl-clang 1000480: bf 88 08 00 00 ………………….. 100049b: e8 e0 fe ff ff 10004a0: 69 c0 34 12 00 00 mov $0x888,%edi mov $0x888,%edi callq 4005f4 <testA> imul $0x1234,%eax,%eax Directly call to UnTrust function. callq 1000380 <testA> imul $0x1234,%eax,%eax Function Return The function return address would be 32 bytes alignment, and limit the target address to allowed 32bits range by R15. int testB(int I) { int vRet; …………… vRet=0x1234*testA(0x888); return vRet; } //by x86_64 pnacl-clang 100049b: e8 e0 fe ff ff 10004a0: 69 c0 34 12 00 00 10004a6: 89 45 f8 10004a9: 83 c4 10 10004ac: 4a 8d 24 3c 10004b0: 8b 2c 24 10004b3: 4a 8d 6c 3d 00 10004b8: 83 c4 08 10004bb: 4a 8d 24 3c 10004bf: 59 10004c0: 83 e1 e0 10004c3: 4c 01 f9 10004c6: ff e1 //by x86_64 GCC 40069b: e8 54 ff ff ff 4006a0: 69 c0 34 12 00 00 4006a6: 89 45 f4 4006a9: 8b 45 f4 4006ac: c9 4006ad: c3 callq 4005f4 <testA> imul $0x1234,%eax,%eax mov %eax,-0xc(%rbp) mov -0xc(%rbp),%eax leaveq retq callq 1000380 <testA> imul $0x1234,%eax,%eax mov %eax,-0x8(%rbp) add $0x10,%esp Clear RSP high-32bits lea (%rsp,%r15,1),%rsp Use R15 to limit RSP mov (%rsp),%ebp lea 0x0(%rbp,%r15,1),%rbp in specific 4GB space add $0x8,%esp lea (%rsp,%r15,1),%rsp pop %rcx RCX in 32bytes Alignment and $0xffffffe0,%ecx add %r15,%rcx Use R15 to limit RCX (return address) jmpq *%rcx in specific 4GB space Read/Write Memory The Read/Write Memory address would be limited in the 4GB 32bits range by R15. int testA(int I) { int vRet; char *p1=0x10000000; char *p2=0x20000000; char *p3; char p4; p3=(char *)((int)p1*2+p2); p3[0x11]=0x66; //Write p4=p3[0x22]; //Read …. } 0000000001000380 <testA>: …… 1000387: c7 45 f4 00 00 00 10 movl $0x10000000,-0xc(%rbp) 100038e: c7 45 f0 00 00 00 20 movl $0x20000000,-0x10(%rbp) ……………….. //Write Clear RAX high-32bits Use R15 to limit RAX in specific 4GB space 10003a3: 89 c0 mov %eax,%eax 10003a5: 41 c6 84 47 11 00 00 movb $0x66,0x20000011(%r15,%rax,2) ……………… 10003b1: 10003b3: 10003b8: 89 c0 41 8a 44 07 22 88 45 eb //Read Clear RAX high-32bits Use R15 to limit RAX in specific 4GB space mov %eax,%eax mov 0x22(%r15,%rax,1),%al mov %al,-0x15(%rbp) NaCl Software Fault Isolation in ARM Trust Code Region UnTrust Code Region 3-1GB 0-1GB NaCl ARMv32 Instruction Software Fault Isolation Memory SP Register Operation in ARMv32 The modification of SP would be replaced by a set instructions to limit the 32bits SP would be below 1GB address testB(0x1234); 250: e3010234 .............. 25c: ebfffffe int testB(int I) { int vRet; void *libraryHandle; … } movw r0, #4660 bl ; 0x1234 160 <testB> Function in 16bytes Alignment //by armv7 pnacl-clang 00000160 <testB>: 160: e92d4800 push {fp, lr} 164: e24dd010 sub sp, sp, #16 168: e3cdd103 bic sp, sp, #-1073741824 ; 0xc0000000 limit the SP address below 1GB address Function Call in ARMv32 The function target address would be 16 bytes alignment, and limit the target address below 1GB address. char gpHello[]="Hello Loda!\n"; int (*f1)(const char *); …… f1 = dlsym(libraryHandle, "puts"); f1(gpHello); //by armv7 pnacl-clang 1bc: ebfffffe bl <dlsym> …… r1 in 16bytes Alignment and limit the address below 1GB address 1e8: e3c1113f bic r1, r1, #-1073741809 ; 0xc000000f 1ec: e12fff31 blx r1 Function Return in ARMv32 Don’t allow “pop {pc}” directly. The function return address would be 16 bytes alignment, and limit the target address in LR below 1GB address. int testB(int I) { int vRet; …………… vRet=0x1234*testA(0x888); return vRet; } limit the SP address below 1GB address //by armv7 pnacl-clang 214: e3cdd103 bic sp, sp, #-1073741824 ; 0xc0000000 218: e8bd4800 pop {fp, lr} 21c: e320f000 nop {0} 220: e3cee13f bic lr, lr, #-1073741809 ; 0xc000000f 224: e12fff1e bx lr LR in 16bytes Alignment and limit the address below 1GB address Read/Write Memory in ARMv32 The Read/Write Memory address would be limited below 1GB address by R0. 00000110 <testA>: 110: e24dd018 114: e3cdd103 118: e3a01201 ............ int testA(int I) 120: e58d100c { 124: e3a01202 128: e59d000c int vRet; 12c: e3a0207b char *p1=0x10000000; 130: e58d1008 char *p2=0x20000000; ............... char *p3; 148: e3a01066 char p4; 14c: e320f000 p3=(char *)((int)p1*2+p2); 150: e3c00103 p3[0x11]=0x66; //Write 154: e5c01000 ............. p4=p3[0x22]; //Read …. } 164: e3c00103 168: e5d00022 16c: e5cd0003 ............... sub sp, sp, #24 bic sp, sp, #-1073741824 ; 0xc0000000 mov r1, #268435456 ; 0x10000000 str mov ldr mov str r1, [sp, #12] r1, #536870912 ; 0x20000000 r0, [sp, #12] r2, #123 ; 0x7b r1, [sp, #8] //Write mov r1, #102 ; 0x66 limit the ro address nop {0} below 1GB address bic r0, r0, #-1073741824 ; 0xc0000000 strb r1, [r0] //Read limit the ro address below 1GB address bic r0, r0, #-1073741824 ; 0xc0000000 ldrb r0, [r0, #34] ; 0x22 strb r0, [sp, #3] For Hacker’s View Inner sandbox: binary validation Would limit stack in specific 4GB memory space int func(…..) { Use R15 to limit UnTrust function call in specific 4GB space …………………………. } Return to caller function. Would limit in 32bytes alignment and specific 4GB memory space. outer sandbox: OS system-call interception Conclusion LLVM support IR and could run on variable processor platforms. Portable native client with LLVM should be a good candidate to play role in Browser usage ,and the LLVM would have a good role on Android platform. It is a new security protection model, use user-space Sandbox to run native code and validate the native instruction without kernel-level privilege involved. Break the Sandbox!!! Appendix The differences of Dalvik and LLVM (1/2) From compiled execution code LLVM transfer to 100% native code. Dalvik VM need to based on the JIT Trace-Run Counter. From the JIT native-code re-used After Dalvik VM process restart, the JIT Trace-Run procedures need to perform again. But after LLVM application transfer to 100% native code, it could run as native application always. From CPU run-time loading Dalvik application need to calculate the Trance-Run Counter in run-time and perform JIT. LLVM-based native application could save this extra CPU loading. The differences of Dalvik and LLVM (2/2) From the run-time memory footprint Dalvik application convert to JIT native code would need extra memory as JIT-Cache. If user use Clang to compile C code as BitCode and then use LLVM compiler to compile the BitCode to native assembly, it could save more run-time memory usage. If Dalvik application transfer the loading to JNI native .so library, it would need extra loading for developer to provide .so for different target processors’ instruction. From the Storage usage General Dalvik application need a original APK with .dex file and extra .odex file in dalvik-cache. But LLVM application doesn’t need it. From the system security view of point LLVM support pointer/function-pointer/inline-assembly and have the more potential security concern than Java. Native Client Page Content { "files": { "libgcc_s.so.1": { "x86-64": { "url": "lib64/libgcc_s.so.1" }, ..... }, "main.nexe": { "x86-64": { "url": "hello_loda_x86_64.nexe" }, ..... }, <html> <body ...> .... <div id="listener"> ..... <embed name="nacl_module" id="hello_loda" width=200 height=200 src="hello_loda.nmf" type="application/x-nacl" /> </div> </body> </html> "libdl.so.3c8d1f2e": { ..... }, "libc.so.3c8d1f2e": { ..... }, "libpthread.so.3c8d1f2e": { ..... }, "program": { "x86-64": { "url": "lib64/runnable-ld.so" }, ..... } } End