CISC vs RISC
Introduction to x86
Chia-Chi Teng

Administrivia
• Course evaluation: DO IT TODAY!!!
  – Part of your professionalism grade
• Mid-term: take home only
• Available Wednesday morning on BB
• Due Saturday 24th midnight
• No class on Wednesday, no HW or LAB this week
• HW 6 due today
• Late work
• Quiz retake

Review
• MIPS assembly
  – Read code
  – Assembly -> machine code
  – Machine code -> assembly
  – Stack
  – C -> assembly
  – Assembly -> C
• Number representation
• Single cycle CPU
• Assembler/compiler/linker
• Performance

Internet Worms
“On July 19th, 2001, a self-propagating program, or worm, was released into the Internet. The worm, dubbed Code-Red v2, probed random Internet hosts for a documented vulnerability in the popular Microsoft IIS Web server. As susceptible hosts were infected with the worm, they too attempted to subvert other hosts, dramatically increasing the incidence of the infection. Over fourteen hours, the worm infected almost 360,000 hosts, reaching an incidence of 2,000 hosts per minute before peaking [1]. The direct costs of recovering from this epidemic (including subsequent strains of Code-Red) have been estimated in excess of $2.6 billion [2]”
Source: Internet Quarantine: Requirements for Containing Self-Propagating Code, David Moore, Colleen Shannon, Geoffrey M.
Voelker, Stefan Savage

Morris Worm
• Morris worm, aka the original Internet worm
  – Robert Morris, Cornell University, Nov 1988
• Exploited a buffer overflow, just like Code-Red
  – DEC VAX, Sun systems
• Infected about 6,000 of the roughly 60,000 computers connected to the Internet, that’s 10%

Buffer overflow exercise
• HW 7-11
• MIPS and x86
• Understand stack structures
• Understand security risks

Review: RISC - Reduced Instruction Set Computer
• RISC philosophy
  – fixed instruction lengths
  – load-store instruction sets
  – limited addressing modes
  – limited operations
• MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC, DEC (Compaq) Alpha, …
• Instruction sets are measured by how well compilers use them as opposed to how well assembly language programmers use them
Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space (embedded systems)

MIPS = RISC = Load-Store architecture
• Every operand must be in a register
  – Except for some small integer constants that can be in the instruction itself (see later)
• Variables have to be loaded into registers
• Results have to be stored to memory
• Explicit Load and Store instructions are needed because there are many more variables than registers
3/19/2016

Example
• The HLL statements
    a = b + c
    d = a + b
• will be “translated” into “pseudo” assembly language as:
    load b in register rx
    load c in register ry
    rz <- rx + ry
    store rz in a      # not destructive; rz still contains the value of a
    rt <- rz + rx
    store rt in d

Variable Instruction Length
• See objdump example.
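To make the contrast with MIPS's fixed instruction length concrete, here is a small sketch that tabulates the byte lengths of a few 32-bit x86 instructions used later in these slides. The encodings are hand-assembled for illustration and the names `X86_ENCODINGS`, `instruction_lengths`, and `code_size` are hypothetical; the output of `objdump -d` on a real binary remains the reference.

```python
# Hand-assembled 32-bit x86 encodings (an assumption for illustration;
# check against `objdump -d` on a real binary).
X86_ENCODINGS = {
    "ret": bytes([0xC3]),
    "push %ebp": bytes([0x55]),
    "mov %esp,%ebp": bytes([0x89, 0xE5]),
    "sub $0x10,%esp": bytes([0x83, 0xEC, 0x10]),
    "mov 0x804965c,%eax": bytes([0xA1, 0x5C, 0x96, 0x04, 0x08]),
}

# Every MIPS instruction occupies exactly one 32-bit word.
MIPS_INSTRUCTION_BYTES = 4


def instruction_lengths(encodings):
    """Map each assembly string to the number of bytes in its encoding."""
    return {asm: len(code) for asm, code in encodings.items()}


def code_size(encodings):
    """Total number of bytes occupied by all listed instructions."""
    return sum(len(code) for code in encodings.values())


if __name__ == "__main__":
    for asm, length in instruction_lengths(X86_ENCODINGS).items():
        print(f"{asm:<22} {length} byte(s)")
    count = len(X86_ENCODINGS)
    print("x86 total:", code_size(X86_ENCODINGS), "bytes")
    print("MIPS, same instruction count:", count * MIPS_INSTRUCTION_BYTES, "bytes")
```

The x86 lengths here range from one to five bytes, whereas any five MIPS instructions always take 20 bytes regardless of what they do.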
Addressing Modes in x86
There are several addressing modes on the IBM PC; some of them:

Mode               Operand type                                       Example          Comment
Register           Register                                           inc bx           Operand: register
Immediate          Constant                                           mov cx, 10       Operands: register, immediate
Memory             Variable                                           mov cx, [n]      Operands: register, memory (n is an address)
Register Indirect  Pointer held in a register                         mov cx, [bx]     Operands: register, register indirect
Base Relative      Pointer in a register plus a constant              mov cx, [bx+1]   Operands: register, base relative
Direct Indexed     Pointer in an index register plus a constant       mov cx, [si+1]   Operands: register, direct indexed
Base Indexed       Pointer in a base register plus an index register  mov cx, [bx+si]  Operands: register, base indexed

Effective address
• The effective address is the actual address the instruction refers to
• As you know, all variable names in assembly are treated as pointers
• Putting a pointer in square brackets always dereferences it
  – myvar refers to the address of the variable myvar
  – [myvar] refers to the contents of myvar instead of its address
  – putting a register in square brackets treats the register as a pointer and yields the contents of the memory it points to
  – However, some assemblers interpret myvar as [myvar], i.e. they assume that programmers always want the value of the variable, not its address (e.g. mov eax, inp would move the value of the variable inp to the register eax). Microsoft's assembler in Visual Studio (MASM) is one of them.
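The bracketed operands in the table above all reduce to the same two steps: add up what is inside the brackets to get the effective address, then read memory at that address. The sketch below models this with a toy register file and memory. The names `effective_address`, `load`, `REGS`, `MEMORY`, and `N`, as well as all the values, are hypothetical, and each memory entry stands for the whole value stored at that address (a real `mov cx, ...` reads two bytes).

```python
def effective_address(regs, base=None, index=None, offset=0):
    """Sum the parts inside the square brackets, wrapped to 16 bits."""
    address = offset
    if base is not None:
        address += regs[base]
    if index is not None:
        address += regs[index]
    return address & 0xFFFF


def load(regs, memory, **operand):
    """Dereference: return the contents stored at the effective address."""
    return memory[effective_address(regs, **operand)]


# Toy machine state (made-up values for illustration only).
N = 0x0200                               # the *address* of variable n
REGS = {"bx": 0x0100, "si": 0x0004}
MEMORY = {0x0005: 5, 0x0100: 7, 0x0101: 8, 0x0104: 9, 0x0200: 42}

RESULTS = {
    "mov cx, [n]": load(REGS, MEMORY, offset=N),                   # memory
    "mov cx, [bx]": load(REGS, MEMORY, base="bx"),                 # register indirect
    "mov cx, [bx+1]": load(REGS, MEMORY, base="bx", offset=1),     # base relative
    "mov cx, [si+1]": load(REGS, MEMORY, index="si", offset=1),    # direct indexed
    "mov cx, [bx+si]": load(REGS, MEMORY, base="bx", index="si"),  # base indexed
}

if __name__ == "__main__":
    print(f"n   = {N:#06x} (address)")
    print(f"[n] = {MEMORY[N]} (contents)")
    for asm, value in RESULTS.items():
        print(f"{asm:<16} -> cx = {value}")
```

Note how `N` by itself is only a number (the address), while `load(..., offset=N)` plays the role of `[n]` and returns the stored value.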
Another addressing mode
More complex base indexed with offset addressing mode:

Base   Index   Scale   Offset
EAX    EAX     1       None
EBX    EBX     2       8-bit
ECX    ECX     4       16-bit
EDX    EDX     8       32-bit
ESP    ESP
EBP    EBP
ESI    ESI
EDI    EDI

Effective Address = Base + (Index * Scale) + Offset
Note: in 32-bit addressing ESP cannot be used as the index, and the offset is encoded as 8 or 32 bits.

Data Unit
• Byte – 8 bit
• Word – 16 bit
• DWord, Long – 32 bit

Registers
• 16-bit registers
  – ax: accumulator register
  – bx: base address register
  – cx: count register
  – dx: data register
• 8-bit part: ah, al, …
• 32-bit extended: eax, ebx, …
• 64-bit (x86-64, e.g. Opteron): rax, rbx, …

Index/Pointer Registers
• Index
  – esi: source index
  – edi: destination index
• Pointer
  – esp: stack pointer
  – ebp: base pointer (stack frame)
• Others
  – eip: instruction pointer

x86 instructions
• Using GNU assembler syntax as example
• mov Source, Destination (Different from MIPS)
  – mov %esp,%ebp
  – mov 0x804965c,%eax
  – mov (%eax),%edx
• push Source (esp -= 4; [esp] = Source)
  – push %eax
  – push (%ebx)
• pop Destination (Destination = [esp]; esp += 4)
  – pop (%ebx)
  – pop %eax

Arithmetic/Logical
• add Src, Dest (Dest = Dest + Src)
  – add %ebx, %eax
  – add %ebx, (%eax)
  – add (%ebx), %eax
  – add $0x10, %eax
  – add %ebx, 0x10(%eax)
  – add %ebx, 0x10(%esi,%edx,4)
• sub, cmp, and, or, …

Flags
• Flag Register
  – Carry
  – Zero
  – Sign
  – Overflow

Jump
• Unconditional jump: jmp
  – jmp address
• Conditional jump: jxx (relative)
  – ja/jae/jg/jge
  – jb/jbe/jl/jle
  – je/jne
  – …
• Work with cmp instruction
    cmp %esi, %edi
    jne Label

Function Call
• call Function
  – push return address (the address of the next instruction)
  – EIP = Function
• ret
  – pop EIP
• Passing parameters
  – Use the stack, more later

Stack Frame
• All local variables are on the stack.
    push %ebp
    mov %esp, %ebp
    sub $0x10, %esp
    …
    mov %eax, 0x8(%ebp)
    …
    mov %ebp, %esp
    pop %ebp
    ret

Call a Function
• Use the stack to save information
    mov param2, 0x4(%esp)
    mov param1, (%esp)
    call func1
  OR
    push param2
    push param1
    call func1
    pop %eax
    pop %eax

Stack: inside a function
    new esp  ->  …
    ebp      ->  ebp (saved)
    old esp  ->  return address
    0x8(ebp) ->  param1
                 param2

System Calls
• Mostly IO functions
  – Operating system dependent
• MIPS (or SPIM): syscall
• x86: interrupts
  – Linux: int 0x80
  – DOS/Windows: int 0x21
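As a way to check the stack picture by hand, the following sketch replays the push-based call sequence and the prologue/epilogue from the Stack Frame slide on a toy stack. It is a simplified model under stated assumptions: 4-byte slots, a stack that grows toward lower addresses, and made-up values for the stack top, return address, and caller's ebp. The names `Stack` and `call_func1` are hypothetical.

```python
WORD = 4  # bytes per 32-bit stack slot


class Stack:
    """Toy model of a downward-growing 32-bit x86 stack."""

    def __init__(self, top=0x1000, ebp=0):
        self.mem = {}    # address -> 32-bit value
        self.esp = top   # stack pointer
        self.ebp = ebp   # base (frame) pointer

    def push(self, value):
        self.esp -= WORD            # esp -= 4
        self.mem[self.esp] = value  # [esp] = value

    def pop(self):
        value = self.mem[self.esp]  # value = [esp]
        self.esp += WORD            # esp += 4
        return value


def call_func1(param1, param2, return_address=0x08048400, caller_ebp=0xBFFF0000):
    """Replay caller pushes, call, prologue, epilogue, ret, and cleanup."""
    s = Stack(ebp=caller_ebp)

    # Caller: push param2; push param1; call func1
    s.push(param2)
    s.push(param1)
    s.push(return_address)  # call pushes the address of the next instruction

    # Callee prologue
    s.push(s.ebp)           # push %ebp
    s.ebp = s.esp           # mov %esp, %ebp
    s.esp -= 0x10           # sub $0x10, %esp (room for locals)

    # What the callee sees relative to ebp
    frame = {
        "saved_ebp": s.mem[s.ebp],
        "return_address": s.mem[s.ebp + 0x4],
        "param1": s.mem[s.ebp + 0x8],
        "param2": s.mem[s.ebp + 0xC],
    }

    # Callee epilogue
    s.esp = s.ebp           # mov %ebp, %esp
    s.ebp = s.pop()         # pop %ebp
    eip = s.pop()           # ret pops the return address into EIP

    # Caller cleanup: two pops discard the parameters
    s.pop()
    s.pop()
    return frame, eip, s.esp, s.ebp


if __name__ == "__main__":
    frame, eip, esp, ebp = call_func1(11, 22)
    for name, value in frame.items():
        print(f"{name:<15} {value:#x}")
    print(f"eip after ret   {eip:#x}")
    print(f"esp after call  {esp:#x}")
```

After the whole sequence, esp and ebp are back at their starting values, which is the invariant the prologue and epilogue exist to preserve.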