Quiz 1: 39 Avg., 19 Std. Dev.
[Figure: histogram of Quiz 1 scores on a 0-100 scale]

Linking
CS 140
Feb. 11, 2015
Ali Jose Mashtizadeh

Outline
• Overview
• Detailed Example
• Shared Libraries
• Optimizations
• Security
• Summary

Compiler Toolchain
[Figure: foo.c -> cc -> foo.S -> as -> foo.o; bar.c -> cc -> bar.S -> as -> bar.o; baz.S -> as -> baz.o; all three object files -> ld -> a.out]
• How to reference functions across files?
• Dynamic libraries
• File Formats
• Security Considerations

Perspective on Memory
• Programming Language: (x += 1; add $1, %eax)
  • Instructions: Operations to perform
  • Variables: Mutable operands
  • Constants: Immutable operands
• Hardware:
  • Executable: Binary code, Just-in-Time compiled code
  • Read-Only: Constants
  • Read-Write: Variables, Stack, Heap
• Hardware accesses variables/code by address
• Linkers, Binary loader, Runtime determine this

Process Specification
• Executable file formats: ELF, aout, COFF, PE, MachO
• Specification between linker/loader/OS
• Explains how and where to load code and data
• Linker builds executable from the object files:

    Header: code length
    Object code: Instructions (ELF calls this .text)
        main: call XXXX
              ret
        bar:  ret
    Exported Symbols (ELF calls this .symtab)
        main: 0
        bar: 40
    Relocations: external refs (ELF calls this .rel.text)
        4: foo
        ...

Program Loading
[Figure: prog.o -> ld -> a.out (Compile Time); a.out -> loader (Load/Runtime)]
• Most POSIX systems use a loader
• Loader maps code and data into memory
• Linker for dynamic libraries (later)
• ELF lets you suggest a loader to the kernel
• Optimizations:
  • Zero-initialized data not written or read
  • Demand load: wait until first use to load/link
  • Copy-on-write/Sharing: read-only data & code

In-Memory Process
What does a process look like?
• Process address space segments:
  • Code (.text)
  • Data (.data)
  • Zeroed Data (.bss)
  • Read-only Data (.rodata)
  • Heap (Not in binary)
  • Stack (Not in binary)
• ELF has simple loader table
[Figure: address space layout with Kernel above Userspace; Userspace holds Stack, Heap, Data, Code]

Who sets up what?
• Global code/data:
  • Generated by the compiler/linker
  • Loaded into memory by loader
• Read-only Data:
  • Space is mmap'ed by loader
• Stack:
  • mmap'ed by kernel, user/kernel for additional threads
  • Stack is allocated/freed by procedures
  • Compiler determines per-function stack usage statically
• Heap:
  • Runtime allocation managed by malloc

Process Creation on Windows
• The spawning application calls into ntdll.dll
• Ntdll.dll determines application type:
  • POSIX, Command Line, OS/2, DOS, Win32
• Runtime may be spawned if different than current:
  • posix.exe, cmd.exe, ntvdm.exe
• Loads memory into process space
• Tells kernel to enter initialization routine in process

Detailed Example

Example: Compiler
• Simple hello world program
• Compile with: % cc -S hello.c
• -S compiles but does not assemble file
• Output hello.S has symbolic reference to printf

    int main(...)
    {
        printf("hello!\n");
    }

            .section .rodata
    .LC0:   .string "hello!\n"
            .text
            .globl main
    main:
            enter $4, $0
            movl $.LC0, (%esp)
            call printf
            leave
            ret

Example: Assembler
• call printf is assembled as call $0
• Assembler assembles each file at address 0x0
• Outputs symbol and relocation tables
• % as hello.S -o hello.o
• % objdump -d hello.o

    0x0000 main:
    0x0000     enter $4, $0
    0x0003     movl $.LC0, (%esp)
    0x000A     call $0
    0x000F     leave
    0x0010     ret

Example: Linker
• Linker must fix the reference to printf
• % ld hello.o -o hello
• % objdump -d hello

    0x00400100 main:
    0x00400100     enter $4, $0
    0x00400103     movl $0x00600000, (%esp)
    0x0040010A     call $0x0043FC84
    0x0040010F     leave
    0x00400110     ret
    0x00600000 LC0: "hello!\n"

Simple Linker
• Pass 1:
  • Coalesce sections with same name
  • Arrange in memory
  • Read symbol tables (maintain a global symbol table)
  • Compute virtual address of sections (start and offset)
• Pass 2:
  • Patch references using global symbol table
  • Emit result into a new object or binary
  • Emit loader table for loader (simplified view of sections)
  • Optionally: Symbol tables may be discarded

Linker Scripts
• Tells linker how and where to load
• Link with script: % ld -T linker.script foo.o
• Output default script: % ld --verbose

    ENTRY(_init)
    OUTPUT_FORMAT(elf32-i386)
    SECTIONS {
        .text : ALIGN(0x1000) {
            *(.text)
        }
        /* Other sections */
    }

Linker Scripts Uses
• Custom linker scripts for kernels
• Used by kernels and apps to have special sections
• Collect data structures into a specified section

    /* gcc/clang */
    int inside_section __attribute__((section(".data.special")));

• Per-thread data

    /* C11/C++11 */
    thread_local int per_thread_global;

Compiler and Linker Interaction
• Code Model: Specifies where code will run
  • Compiler must choose the right assembly instructions
  • These may be relative to where the code is located
  • Negative vs Positive addresses
  • Small code may use shorter relative addresses
  • The linker usually can't modify instructions
  • Architecture specific
  • small (code+data < 2GB), medium (code < 2GB), large (no restrictions), kernel (code > 2GB)
• % cc -mcmodel=kernel -c pmap.c -o pmap.o

ELF: File Header
• Print header: % readelf -h a.out

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - FreeBSD
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x400680
      Start of program headers:          64 (bytes into file)
      Start of section headers:          3856 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         8
      Size of section headers:           64 (bytes)
      Number of section headers:         28
      Section header string table index: 25

ELF: Sections
• Print sections: % readelf -S a.out

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      ...
      [12] .text             PROGBITS         0000000000400680  00000680
           0000000000000328  0000000000000000  AX       0     0     16
      ...
      [22] .data             PROGBITS         0000000000600c78  00000c78
           000000000000001c  0000000000000000  WA       0     0     8
      [23] .bss              NOBITS           0000000000600c98  00000c94
           0000000000000008  0000000000000000  WA       0     0     8
      ...
      [26] .symtab           SYMTAB           0000000000000000  00001610
           0000000000000678  0000000000000018          27    52     8
      [27] .strtab           STRTAB           0000000000000000  00001c88
           0000000000000313  0000000000000000           0     0     1

ELF: Program Header
• Print program header: % readelf -l a.out

    Program Headers:
      Type    Offset             VirtAddr           PhysAddr
              FileSiz            MemSiz             Flags  Align
      PHDR    0x0000000000000040 0x0000000000400040 0x0000000000400040
              0x00000000000001c0 0x00000000000001c0 R E    8
      INTERP  0x0000000000000200 0x0000000000400200 0x0000000000400200
              0x0000000000000015 0x0000000000000015 R      1
          [Requesting program interpreter: /libexec/ld-elf.so.1]
      LOAD    0x0000000000000000 0x0000000000400000 0x0000000000400000
              0x0000000000000a64 0x0000000000000a64 R E    200000
      LOAD    0x0000000000000a68 0x0000000000600a68 0x0000000000600a68
              0x000000000000022c 0x0000000000000238 RW     200000
      ...

• The OS and loader use these for loading
• Loader uses sections when linking against libraries

C++ & Name mangling
• C++ has overloaded functions:
  • Same name but different types
• Name mangling creates unique symbol per type
• Compiler and/or version specific

    // C++
    int foo(int a)
    { return 0; }
    int foo(int a, int b)
    { return 0; }

    % nm foo.o
    0000 T _Z3fooi
    0008 T _Z3fooii

    % nm foo.o | c++filt
    0000 T foo(int)
    0008 T foo(int, int)

Shared Libraries

Dynamic Linking
• Shared libraries:
  • Make upgrading, bug fixing, and security patches easier
  • Reduce total code size installed
  • Plugins
• ELF: Main binary specifies which loader to use:
  • BSD: /libexec/ld-elf.so.1
• Load-time or Run-time linking

Static Shared Libraries
• Programs often share many libraries like libc
• *.a files are archives of object files created with ar
• % ar -rc libc.a printf.o scanf.o ...
[Figure: ls, sh, and cc each contain their own copy of libc.a]

Dynamic Shared Libraries
• No need to recompile software on libc changes
• Must be compiled with -fpic (more on this next)
• % ld -shared -o libc.so printf.o ...
[Figure: ls, sh, and cc share a single copy of libc]

Compiler Flag -fpic
• Compiler generates position-independent (relocatable) code
  • Uses PC-relative addressing
  • Architecture specific
• Linking shared libraries
  • Same procedure as linking a program binary
  • Different types of addressing modes are handled

Position Independent Code
• Without indirection, the loader has to patch every call into a library
• Very slow loading times!
• Instead we use indirection:

    Program
        main: ...
              call printf
    Procedure Linkage Table (PLT)
        printf: jmp *GOT[5]
    Global Offset Table
        ...
        [5]: &printf (libc)

Lazy Dynamic Linking
• GOT entries initially point to dlfixup
• Loader patches the GOT entry on first use

    Program
        main: ...
              call printf
    Procedure Linkage Table (PLT)
        printf: jmp *GOT[5]
    Global Offset Table
        ...
        [5]: &dlfixup

Explicit Dynamic Linking
• Bind to a symbol at runtime
• Used for loading plugins

    // Open dynamic library
    void *p = dlopen("foo.so", RTLD_LAZY);
    // Lookup symbol
    void (*fp)(void) = dlsym(p, "foo");
    // Run function pointer
    fp();

Optimizations

Link Time Optimization
• Link Time Optimization
  • Compiler optimizations that cross modules
  • Inlining of code
  • Simplification, dead code elimination, etc.
• Requires linking compiler intermediate representation (IR)
• Clang supports this if you link llvm IR

    % clang -emit-llvm -c foo.c -o foo.o
    % clang -emit-llvm -c bar.c -o bar.o
    % clang foo.o bar.o -o a.out

Profile-Guided Optimization
• Collect performance profiling of code usage
• Optimize code for size/performance based on runs

    Generate instrumented code
    % clang -fprofile-instr-generate foo.c -o foo
    Run and collect data (writes default.profraw)
    % ./foo
    Merge the raw profile
    % llvm-profdata merge -output=foo.prof default.profraw
    Give clang the profiling data
    % clang -fprofile-instr-use=foo.prof foo.c -o foo

Security

Attacks

    void fn()
    {
        char buf[80];
        gets(buf);
        ...
    }

1. Attacker injects code into a buffer:
   Code usually tries to execute a shell
2. Overwrites return address using buffer:
   Pointer points to code

Linking and Security
• No eXecute (NX):
  • Loader marks code as read-only
  • Stacks/Data marked as non-executable
• Address Space Layout Randomization (ASLR):
  • Relocate executable to a different address on each load
  • Makes it harder for the attacker to determine addresses
  • Attacks usually require an information leak bug
• Compiler Protection:
  • Stack protector (stack cookies): Check for buffer overflows
  • Bounds checking: Hard to enforce system calls
  • Control flow integrity: Verify code pointers

ASLR: Compiling
• Binaries compiled with special flags:
• For shared/dynamic libraries:
  • % cc -fpic -c printf.c -o printf.o
• For static libraries and program binary:
  • % cc -fpie -c main.c -o main.o

ASLR: Loading
• Requires kernel+loader support for ASLR
• Kernel randomizes initial stack
• Loader chooses random address to load at
• Every library and program can be randomized
• Heap should be randomized by libc
• Requires exec() to rerandomize!

Blind ROP Attack
• Brute force attack for remote exploits
• Requires approximately 2k-4k requests
• ASLR broken by "stack reading"
• Defeatable if program always fork-exec's
• MySQL bug allows attackers to bypass ASLR
• GOT/PLT simplifies searching for useful functions
• Code & Paper: http://www.scs.stanford.edu/brop/

Linking Summary
• Compiler/Assembler:
  • Generate one object file for each source file
  • Don't know about memory layout and external refs
• Linker:
  • Combines all object files into a library or executable
  • Determines memory layout
• OS:
  • Loads loader and initial stack
• Loader:
  • Loads binary and libraries into memory
  • Links shared libraries using GOT/PLT
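
Sketch: a toy two-pass linker

The linker's role in the summary (combine object files, determine memory layout, patch external references) can be illustrated with a toy two-pass linker in C. This is a minimal sketch under stated assumptions, not how ld works internally. The Object, Symbol, and Reloc structures, the link_objects function, the fixed table sizes, and the relocation format (a 4-byte hole filled with the symbol's absolute address) are all hypothetical and invented for illustration. Real x86 call instructions take a PC-relative operand, and a real linker also rejects duplicate symbol definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy object format: one code section per object file. */
typedef struct { const char *name; uint32_t offset; } Symbol; /* exported symbol */
typedef struct { uint32_t offset; const char *name; } Reloc;  /* 4-byte hole to patch */

typedef struct {
    const uint8_t *code;   uint32_t code_len;
    const Symbol  *syms;   int nsyms;
    const Reloc   *relocs; int nrelocs;
} Object;

#define MAX_OBJECTS 16
#define MAX_GLOBALS 64

/* Links objs[0..n) into out as if the image were loaded at address base.
 * Returns the image size in bytes, or -1 on error. */
long link_objects(const Object *objs, int n, uint32_t base,
                  uint8_t *out, uint32_t out_cap)
{
    struct { const char *name; uint32_t addr; } globals[MAX_GLOBALS];
    uint32_t start[MAX_OBJECTS]; /* offset of each object's code in out */
    int nglobals = 0;
    uint32_t size = 0;

    if (n > MAX_OBJECTS)
        return -1;

    /* Pass 1: coalesce the code sections, assign addresses, and build
     * the global symbol table. */
    for (int i = 0; i < n; i++) {
        if (objs[i].code_len > out_cap - size)
            return -1;                        /* output buffer too small */
        start[i] = size;
        memcpy(out + size, objs[i].code, objs[i].code_len);
        for (int s = 0; s < objs[i].nsyms; s++) {
            if (nglobals == MAX_GLOBALS)
                return -1;
            globals[nglobals].name = objs[i].syms[s].name;
            globals[nglobals].addr = base + size + objs[i].syms[s].offset;
            nglobals++;
        }
        size += objs[i].code_len;
    }

    /* Pass 2: patch every relocation using the global symbol table. */
    for (int i = 0; i < n; i++) {
        for (int r = 0; r < objs[i].nrelocs; r++) {
            const Reloc *rel = &objs[i].relocs[r];
            int g = 0;
            while (g < nglobals && strcmp(globals[g].name, rel->name) != 0)
                g++;
            if (g == nglobals)
                return -1;                    /* undefined symbol */
            assert(rel->offset + 4 <= objs[i].code_len);
            memcpy(out + start[i] + rel->offset, &globals[g].addr, 4);
        }
    }
    return (long)size;
}
```

For example, linking a 6-byte main whose relocation at offset 1 names foo, followed by a 1-byte foo, at base 0x400000 produces a 7-byte image in which the hole holds 0x400006, the address assigned to foo.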