Low level Programming

Linux ABI
• System Calls
  – Everything distills into a system call
    • /sys, /dev, /proc are accessed with read() & write() syscalls
• What is a system call?
  – Special purpose function call
    • Elevates privilege
    • Executes function in kernel
  – But what is a function call?

What is a function call?
• Special form of jmp
  – Execute a block of code at a given address
  – Special instruction: call <fn-address>
  – Why not just use jmp?
• What do function calls need?
  – int foo(int arg1, char * arg2);
    • Location: foo()
    • Arguments: arg1, arg2, …
    • Return code: int
  – Must be implemented at hardware level

Hardware implementation

    int foo(int arg1, char * arg2) {
        return 0;
    }

    0000000000000107 <foo>:
     107:   55                      push   %rbp
     108:   48 89 e5                mov    %rsp,%rbp
     10b:   89 7d fc                mov    %edi,-0x4(%rbp)
     10e:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
     112:   b8 00 00 00 00          mov    $0x0,%eax
     117:   c9                      leaveq
     118:   c3                      retq

• Location
  – Address of function + ret instruction
• Arguments
  – Passed in registers (which ones? And why those?)
• Return code
  – Stored in register: EAX
• To understand this we need to know about assembly programming…

Assembly basics
• What makes up assembly code?
  – Instructions
    • Architecture specific
  – Operands
    • Registers
    • Memory (specified as an address)
    • Immediates
  – Conventions
    • Rules of the road and/or behavior models

Registers
• General purpose
  – 16 bit: AX, BX, CX, DX, SI, DI
  – 32 bit: EAX, EBX, ECX, EDX, ESI, EDI
  – 64 bit: RAX, RBX, RCX, RDX, RSI, RDI + others
• Environmental
  – RSP, RIP
  – RBP = frame pointer, defines local scope
• Special uses
  – Calling conventions
    • RAX == return code
    • RDI, RSI, RDX, RCX… == ordered arguments
  – Hardware defined
    • Some instructions implicitly use specific registers
      – RSI/RDI: string instructions
      – RBP: leaveq

Memory
• X86 provides complex memory addressing capabilities
  – Absolute addressing (the address is given as a constant)
    • mov %rsi, 0xfff000
  – Register-indirect addressing (the address is held in a register)
    • mov %rsi, (%rbp)
  – Offset addressing (register + constant displacement)
    • mov %rsi, 0x8(%rax)
• Base + (Index * Scale) + Displacement
  – A.K.A. SIB
  – Occasionally seen
  – Hardly ever used by hand
  – movl %ebp, (%rdi,%rsi,4)
    • Address = rdi + rsi * 4
  – A more complicated example
    • segment:disp(base, index, scale)

8/16/32/64 bit operands
• Programmer explicitly specifies operand length with the instruction suffix
• Example: mov reg, reg
  – 8 bits: movb %al, %bl
  – 16 bits: movw %ax, %bx
  – 32 bits: movl %eax, %ebx
  – 64 bits: movq %rax, %rbx
• What about "movl %ebx, (%rdi)"?

Function call implementation
We can now decode what is going on here

    int foo(int arg1, char * arg2) {
        return 0;
    }

    0000000000000107 <foo>:
     107:   55                      push   %rbp
     108:   48 89 e5                mov    %rsp,%rbp
     10b:   89 7d fc                mov    %edi,-0x4(%rbp)
     10e:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
     112:   b8 00 00 00 00          mov    $0x0,%eax
     117:   c9                      leaveq
     118:   c3                      retq

• Location
  – Address of function + ret instruction
• Arguments
  – Passed in registers (which ones? And why those?)
• Return code
  – Stored in register: EAX

OS development requires assembly programming
• OS operations are not typically expressible with a higher level language
  – Examples: atomic operations, page table management, configuring segments
    • System calls(!)
• How to mix assembly with OS code (in C)
  – Compile with assembler and link with C code
    • .S files compiled with gas
  – Inline w/ compiler support
    • .c files compiled with gcc

Implementing assembler functions
• C functions:
  – Location, args, return code
• ASM functions:
  – Location only
  – Programmer must implement everything else
    • Arguments, context, return values
    • Everything in foo() from before + function body
• Programmer takes place of compiler
  – Must match calling conventions

Calling assembler functions
• Programmer implements calling convention
  – Behaves just like a regular function
• Only need location
  – Linker takes care of the rest

foo.S (.globl declares foo as a global symbol, visible to the linker):

    .globl foo
    foo:
        push %rbp
        mov  %rsp, %rbp
        …

main.c:

    extern int foo(int, char *);

    int main() {
        int x = foo(1, "test");
    }

Inline
• OS only needs a few full blown assembly functions
  – Context switches, interrupt handling, a few others
• Most of the time just need to execute a single instruction
  – e.g. set a bit in a control register
• GCC provides ability to incorporate inline assembly instructions into a regular .c file
  – Not a function
  – Compiler handles argument marshaling

Overview
• Inline assembly includes 2 components
  – Assembly code
  – Compiler directives for operand marshaling

    asm ( assembler template
        : output operands                 /* optional */
        : input operands                  /* optional */
        : list of clobbered registers     /* optional */
        );

Inline assembly execution
• Sequence of individual assembly instructions
  – Can execute any hardware instruction
  – Can reference any register or memory location
  – Can reference specified variables in C code
• 3 stages of execution
  1. Load C variables into correct registers or memory
  2. Execute assembly instructions
  3. Copy register and memory contents into C variables

Specifying inline operands
• How does the compiler copy C variables to/from registers?
• C variables and registers are explicitly linked in asm specification
  – Sections for input and output operands
  – Compiler handles copying to and from variables before and after assembly executed
  – Assembly code references marshaled values (index of operand) instead of raw registers

Operand Codes
• Wide range of operand codes ("constraints") are available
  – Input: "code"(c-variable)
  – Output: "=code"(c-variable)

Explicit register codes:

    a = %rax, %eax, %ax
    b = %rbx, %ebx, %bx
    c = %rcx, %ecx, %cx
    d = %rdx, %edx, %dx
    S = %rsi, %esi, %si
    D = %rdi, %edi, %di

Other operand codes:

    r = any register
    q = a, b, c, d regs
    m = memory operand
    f = floating point reg
    i = immediate
    g = anything

And many more….

Register example

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=b"(b)    /* output */
             : "a"(a)     /* input */
             :
            );
        return 0;
    }

What does this do?

    0000000000000107 <foo>:
     107:   55                      push   %rbp
     108:   48 89 e5                mov    %rsp,%rbp
     10b:   53                      push   %rbx
     10c:   89 7d e4                mov    %edi,-0x1c(%rbp)
     10f:   48 89 75 d8             mov    %rsi,-0x28(%rbp)
     113:   c7 45 f0 0a 00 00 00    movl   $0xa,-0x10(%rbp)
     11a:   8b 45 f0                mov    -0x10(%rbp),%eax
     11d:   89 c1                   mov    %eax,%ecx
     11f:   89 cb                   mov    %ecx,%ebx
     121:   89 d8                   mov    %ebx,%eax
     123:   89 45 f4                mov    %eax,-0xc(%rbp)
     126:   b8 00 00 00 00          mov    $0x0,%eax
     12b:   5b                      pop    %rbx
     12c:   c9                      leaveq
     12d:   c3                      retq

Memory example
• X86 can also use memory (SIB, etc) operands
  – "m" operand code

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             :
            );
        return 0;
    }

    0000000000000107 <foo>:
       0:   55                      push   %rbp
       1:   48 89 e5                mov    %rsp,%rbp
       4:   89 7d ec                mov    %edi,-0x14(%rbp)
       7:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:   c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
      12:   8b 4d fc                mov    -0x4(%rbp),%ecx
      15:   89 4d f8                mov    %ecx,-0x8(%rbp)
      18:   b8 00 00 00 00          mov    $0x0,%eax
      1d:   c9                      leaveq
      1e:   c3                      retq

Input/output operands
• Sometimes input and output operands are the same variable
  – Transform input variable in some way

    int foo(int arg1, char * arg2) {
        int a=10, b=5;
        asm ("addl %1, %0;\n"
             : "=r"(b)
             : "m"(a), "0"(b)
             :
            );
        return 0;
    }

    0000000000000107 <foo>:
       0:   55                      push   %rbp
       1:   48 89 e5                mov    %rsp,%rbp
       4:   89 7d ec                mov    %edi,-0x14(%rbp)
       7:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:   c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
      12:   c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
      19:   8b 45 fc                mov    -0x4(%rbp),%eax
      1c:   03 45 f8                add    -0x8(%rbp),%eax
      1f:   89 45 fc                mov    %eax,-0x4(%rbp)
      22:   b8 00 00 00 00          mov    $0x0,%eax
      27:   c9                      leaveq
      28:   c3                      retq

Input/output operands (2)
• Input/output operands can also be specified with "+"

    int foo(int arg1, char * arg2) {
        int a=10, b=5;
        asm ("addl %1, %0;\n"
             : "+r"(b)
             : "m"(a)
             :
            );
        return 0;
    }

    0000000000000107 <foo>:
       0:   55                      push   %rbp
       1:   48 89 e5                mov    %rsp,%rbp
       4:   89 7d ec                mov    %edi,-0x14(%rbp)
       7:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:   c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
      12:   c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
      19:   8b 45 fc                mov    -0x4(%rbp),%eax
      1c:   03 45 f8                add    -0x8(%rbp),%eax
      1f:   89 45 fc                mov    %eax,-0x4(%rbp)
      22:   b8 00 00 00 00          mov    $0x0,%eax
      27:   c9                      leaveq
      28:   c3                      retq

Clobbered list
• We cheated earlier…

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             :
            );
        return 0;
    }

• How does the compiler know to save/restore ECX?
  – It doesn't
• We must explicitly tell the compiler which registers have been implicitly messed with
  – In this case ECX, but other instructions have implicit operands (CHECK THE MANUALS)
• Second set of constraints to inline assembly
  – Clobber list: operands not used as either input or output but that still must be saved/restored by the compiler

Why clobber list?
• Why do we need this?
  – Compilers try to optimize performance
    • Cache intermediate values and assume values don't change
    • Compiler cannot inspect ASM behavior
      – Outside scope of compiler
• Clobber lists tell compiler:
  – "You cannot trust the contents of these resources after this point"
  – Or "Do not perform optimizations that span this block on these resources"

Using clobber lists

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             : "ecx", "memory"
            );
        return 0;
    }

• ECX is used implicitly so its value must be saved/restored
• What about "memory"?

Back to system calls
• Function calls not that special
  – Just an abstraction built on top of hardware
• System calls are basically function calls
  – With a few minor changes
    • Privilege elevation
    • Constrained entry points
  – Functions can call to any address
  – System calls must go through "gates"

Implementing system calls
• System calls are implemented as a single function call: syscall()
  – read() and write() actually just invoke syscall()
• What does syscall do?
  – Enters into the kernel at a known location
  – Elevates privilege
  – Instantiates kernel level environment
• Once inside the kernel, an appropriate system call handler is invoked based on arguments to syscall()

x86 and Linux
• Number of different mechanisms for implementing syscall
  – Legacy: int 0x80
    • Invokes a single interrupt handler
  – 32 bit: SYSENTER
    • Special instruction that sets up preset kernel environment
  – 64 bit: SYSCALL
    • 64 bit version of SYSENTER
• All jump to a preconfigured execution environment inside kernel space
  – Either interrupt context or OS defined context
• What about arguments?
  – syscall(int syscall_num, args…)

Specific system calls
• Each system call has a number assigned to it
  – Index into a system call table
    • Function pointers referencing each syscall handler
• syscall(int syscall_num, args…)
  – Sets up kernel environment
  – Invokes syscall_table[syscall_num](args…);
  – Returns to user space:
    • Resets environment to state before call
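
The two ends of a system call can be sketched in C. This is a minimal illustration under stated assumptions, not how glibc or the Linux kernel is actually written. The names raw_syscall3, toy_syscall, toy_syscall_table, toy_add and toy_neg are hypothetical and do not come from the slides. The first half assumes an x86-64 Linux target: it uses inline assembly with explicit register constraints to execute SYSCALL, with the call number in RAX and arguments in RDI, RSI, RDX, and it lists RCX and R11 as clobbers because the SYSCALL instruction overwrites them. The second half is a user-space model of the table dispatch described above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_getpid: a real Linux syscall number */
#include <unistd.h>        /* syscall(), getpid() */

/* User side (hypothetical helper, x86-64 Linux only).
 * Constraints place num in RAX and the arguments in RDI, RSI, RDX;
 * the kernel's result comes back in RAX. RCX and R11 are overwritten
 * by the SYSCALL instruction itself, so they go in the clobber list. */
long raw_syscall3(long num, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                              /* output */
                      : "a"(num), "D"(a1), "S"(a2), "d"(a3)    /* inputs */
                      : "rcx", "r11", "memory");               /* clobbers */
    return ret;
}

/* Kernel side, modeled in user space: a table of function pointers
 * indexed by syscall number. The handlers are toy stand-ins. */
typedef long (*syscall_fn)(long, long, long);

long toy_add(long a, long b, long c) { (void)c; return a + b; }
long toy_neg(long a, long b, long c) { (void)b; (void)c; return -a; }

const syscall_fn toy_syscall_table[] = { toy_add, toy_neg };

long toy_syscall(long num, long a1, long a2, long a3)
{
    size_t count = sizeof toy_syscall_table / sizeof toy_syscall_table[0];

    /* Constrained entry: an out-of-range number is rejected rather than
     * used as an index. This sketch simply returns -1 in that case. */
    if (num < 0 || (size_t)num >= count)
        return -1;

    return toy_syscall_table[num](a1, a2, a3);
}
```

On an x86-64 Linux machine, raw_syscall3(SYS_getpid, 0, 0, 0) should return the same value as getpid() and as the libc wrapper syscall(SYS_getpid). In the model, toy_syscall(0, 2, 3, 0) dispatches to toy_add, and toy_syscall(9, 0, 0, 0) is rejected because 9 is not a valid index into the table.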