PPT - Department of Computer Science

CS 3843 Computer Organization
Prof. Qi Tian
Fall 2013
http://www.cs.utsa.edu/~qitian/CS3843/
1

Chapter 3 Machine-Level Representations of Programs
• 11/11/2013 (Monday)
  – Section 3.7.5 Procedure
  – Quiz 4
2

Chapter 3 Machine-Level Representations of Programs
• 11/08/2013 (Friday)
  – Tracking a recursive procedure Section 3.7.5
  – Solution is posted under Resources.
• 11/06/2013 (Wednesday)
  – Tracking a procedure Section 3.7.4
  – Tracking a recursive procedure Section 3.7.5
  – Reminder: Quiz on Friday Nov. 8
• 11/04/2013 (Monday)
  – Section 3.7.4 Procedure
  – 2nd Midterm Exam on Friday Nov. 15
3

Chapter 3 Machine-Level Representations of Programs
• 11/01/2013 (Friday)
  – Loop slides 89-96
  – Questions on Assignment 4
• 10/30/2013 (Wednesday)
  – Practice Problems on Conditional Flags
  – Reminder: Quiz on Friday Nov. 1st
• 10/28/2013 (Monday)
  – Jump Instructions slides 76-87
  – Assignment 4 is due Nov. 4.
4

Chapter 3 Machine-Level Representations of Programs
• The week of 10/21-10/25
  – Replacement Lectures by Prof. Turgay Korkmaz and TA
  – Slides 52-75
5

Chapter 3 Machine-Level Representations of Programs
• 10/18/2013 (Friday)
  – Shift Operations
  – Examples 4-8
  – Note: Conference Travel Oct. 21-25.
    • Replacement Lectures by Prof. Turgay Korkmaz and TA
• 10/16/2013 (Wednesday)
  – Examples 2-3
  – Arithmetic and Logical Operations
  – Practice Problems 4 and 5
  – Slides 33-42
• 10/14/2013 (Monday)
  – Movement Instructions
  – Practice Problems 2 and 3, Example 1
  – Slides 26-32
6

Chapter 3 Machine-Level Representations of Programs
• 10/11/2013 (Friday)
  – Operand forms
  – Practice Problem 1
• 10/09/2013 (Wednesday)
  – An introduction to Assembly Code
    • Read Sections 3.1-3.3
    • Slides 1-16
7

Example of Assembly Codes
• Example 1
  – sum.c
    int add(int x, int y)
    {
        int z;
        z = x + y;
        return z;
    }
• What is its assembly code?
    gcc -O1 -S sum.c
• Department Machines: elk01(~08).cs.utsa.edu
8

sum.s
        .file   "sum.c"
        .text
    .globl add
        .type   add, @function
    add:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        addl    8(%ebp), %eax
        popl    %ebp
        ret
        .size   add, .-add
        .ident  "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
        .section        .note.GNU-stack,"",@progbits
• Ignore the lines that start with .
• The pushl and popl save and restore %ebp
• In movl, the first argument is the source, and the second is the destination
• addl adds the source and destination and stores the result in the destination
• %eax is used to hold the return value.
• x and y are at 8(%ebp) and 12(%ebp)
• Stack set-up and completion
9

Assembly Code
• Highly machine specific
• Why study it?
  – Being able to read and understand it is an important skill for serious programmers.
  – The need has shifted over the years from being able to write programs directly in assembly to being able to read and understand the code generated by compilers.
10

An Introduction to Assembly Language
• IA32 – Intel Architecture, 32 bits
  – The dominant machine language of most computers, together with x86-64, its extension to run on 64-bit machines.
  – All the examples in this chapter relate mainly to 32-bit IA32; it is our focus.
  – Computers execute machine code.
    • Sequences of bytes encoding low-level operations.
  – Assembly code: a textual representation of the machine code, giving the individual instructions in the program.
  – The compiler (e.g., the gcc C compiler) invokes an assembler and a linker to generate the executable machine code from the assembly code.
• We take a close look at machine code and its human-readable representation as assembly code.
11

ATT versus Intel Assembly-code formats
• In our presentation, we use ATT format
  – The default format for GCC, OBJDUMP, and the other tools
• Other programming tools, including those from Microsoft, as well as the documentation from Intel, use Intel format.
  – gcc -O1 -S -masm=intel sum.c
12

In the simplest assembly language model, a computer consists of
• A main memory
  – An array of bytes
  – Numbered consecutively, starting at 0.
    • These numbers are called memory addresses.
• A program counter or PC
  – Holds a memory address.
  – Called %eip in IA32.
• A register file containing a small number of named locations.
  – Each location (register) can hold a fixed amount of information corresponding to the word size of the machine
    • Typical word size is 4 bytes (32-bit machine)
    • %eax, %edx, %ecx, %ebx, %esi, %edi, %esp, %ebp (8 registers)
• Condition code registers
  – Contain information about the last arithmetic or logical operation.
  – For example, ZF (zero flag) is set if the last operation resulted in 0.
  – For example, SF (sign flag) is set if the last operation yielded a negative value.
• A set of floating-point registers for holding floating-point data
13

Section 3.1 History of Intel Processor Line
• 1972: 8008 (3.5K) - first Intel microprocessor with 8-bit words. The instruction set was designed by Datapoint Corporation, which was a leading maker of programmable CRT terminals. Datapoint was based in San Antonio, so you might say that the Intel architecture started just a few miles from here.
• 1974: 8080 (4.5K) - first successful Intel microprocessor, had some 16-bit instructions.
• 1978: 8086 (29K) - one of the first 16-bit microprocessors. 20-bit addresses with segmented address space.
• 1979: 8088 (29K) - an 8086 with an 8-bit external bus; basis of the original IBM PC
• 1980: 8087 (45K) - a floating-point coprocessor for the 8086 and 8088; formed the basis for the IEEE floating-point standard.
• 1982: 80286 (134K) - basis of the IBM PC-AT and MS Windows
• 1985: 80386 (275K) (also called i386; expanded the architecture to 32 bits) - added flat address space, could run Linux.
• 1989: 80486 (1.2M) - integrated the floating-point processor
• 1993: Pentium (3.1M) - improved performance
• 1995: PentiumPro (5.5M) - new processor design
• 1997: Pentium 2 (7M) - more of the same
• 1999: Pentium 3 (8.2M) - new floating-point instructions
• 2000: Pentium 4 (42M) - double-precision floating point and many new instructions.
• 2004: Pentium 4E (125M) - added hyperthreading
• 2006: Core2 Duo (291M) - multiple cores, not hyperthreading
• 2008: Core i7 Quad (781M) - multiple cores and hyperthreading
• 2010: Itanium Tukwila (2B) - instruction-level parallelism
• 2011: Xeon Westmere (2.6B) - 10 cores
14

Stack
• Stack
  – Some region of memory
  – A data structure where values can be added or deleted, but only according to a "last-in, first-out" discipline
  – push: add data
  – pop: remove data
15

Consider the following
    int sum(int x, int y)
    {
        return x + y;
    }
• Before the function is entered, a stack is set up with the stack pointer contained in a designated register (%esp).
• The stack grows toward low memory.
• The stack pointer points to the last item pushed on the stack.
• The values of x and y are pushed on the stack.
• The return address is also pushed on the stack.
• Assume %esp is the stack pointer and all items are 4 bytes.
• The return address is at 0(%esp), and the return value is stored in %eax.
• x is at 4(%esp).
• y is at 8(%esp).
16

Machine code
• cc -c sum.s
• objdump -d sum.o
  which produces

    sum.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <add>:
       0:   55          push   %ebp
       1:   89 e5       mov    %esp,%ebp
       3:   8b 45 0c    mov    0xc(%ebp),%eax
       6:   03 45 08    add    0x8(%ebp),%eax
       9:   5d          pop    %ebp
       a:   c3          ret

• To inspect the contents of machine-code files, a class of programs known as disassemblers can be invaluable.
• objdump (for "object dump") generates a format similar to assembly code from the machine code.
• Each instruction takes up 1 to 15 bytes
• Common instructions such as push, pop, or ret are short
17

Machine Code
• To use this program, we need a main to call it:
• e1.c
    #include <stdio.h>

    int add(int x, int y);

    int main()
    {
        int x = 12;
        int y = 31;
        int z;
        z = add(x, y);
        printf("x is %d, y is %d, and z is %d\n", x, y, z);
        return 0;
    }
• We do: cc -O1 -S e1.c to create e1.s, which is…
18

e1.s – Assembly code
    main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $20, %esp
        movl    $31, 4(%esp)
        movl    $12, (%esp)
        call    add
        movl    %eax, 12(%esp)
        movl    $31, 8(%esp)
        movl    $12, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        movl    $0, %eax
        addl    $20, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
19

Section 3.2 Program Encoding
• gcc -O1 -o sum sum.c
  – The -O1 is a compiler option telling it to limit the optimizations used.
  – The compiler generates assembly code: sum.s
  – The assembler converts the assembly code into object code: sum.o
  – The linker combines the object code with the libraries to produce an executable: sum
  – The sum.s file is not saved by default.
• You can look at the assembly code generated using:
  – gcc -O1 -S sum.c
    This produces a file sum.s in ATT format.
  – gcc -O1 -S -masm=intel sum.c
    This produces a file sum.s in Intel format.
21

IA32 32-bit registers
• Eight 32-bit registers
    %eax: accumulator
    %ecx: counter
    %edx: data
    %ebx: base
    %esi: source
    %edi: destination
    %esp: stack pointer
    %ebp: frame pointer
22

Section 3.4 Access Information
• IA32 Registers
  − Eight 8-bit registers
  − Eight 16-bit registers
  − Eight 32-bit registers
• The first six 32-bit registers can be considered general-purpose registers, but historically they had specific uses.
• You can modify the 8-bit registers without modifying the rest of the bits of the corresponding 32-bit register.
23

Why these strange names?
• They go back to the 8080, an 8-bit machine with registers: A, B, C, D, etc.
  – The 8086 had 16-bit registers: ax, bx, cx, dx, where ax was made up of two 8-bit registers, al and ah.
  – Similarly with bx, cx, and dx.
  – The 32-bit version (80386) extended these to 32 bits, making eax, ebx, etc.
  – The low 16 bits of eax are just ax, and ax is made up of ah and al.
• The 64-bit Itanium (IA-64) architecture has 128 64-bit registers called r0 - r127. (x86-64, by contrast, has 16 general-purpose registers.)
24

Section 3.3 Data Formats for IA32
• b Byte: 8 bits (of course) – used for char
• w Word: 16 bits (for compatibility with the 16-bit architecture) – used for short
• l Double Word: 32 bits – used for int, long, and pointers
• s Single Precision: 32 bits – used for float
• l Double Precision: 64 bits – used for double
• t Extended Precision: 80 or 96 bits – used for long double
• No direct support for long long (64-bit ints). Operations must be done in pieces.
25

Section 3.4.1 Operand Specifiers
• There are 11 basic forms for operands.
  – 1 for immediate (constant) values
  – 1 for registers
  – The rest are for memory.
• Three operand types:
  – Immediate: for constant values
    • Written with a $ followed by an integer, e.g., $-577 or $0x17
  – Register: denotes the contents of one of the registers
    • For register Ea, its value is written R[Ea]
  – Memory: denotes a value stored in memory
    • Mb[Addr] denotes the b-byte value stored in memory starting at address Addr
26

Operand Forms
• Operands can denote immediate (constant) values, register values, or values from memory.
• The scaling factor s must be 1, 2, 4, or 8
• The general form is shown at the bottom of the table.
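As a rough illustration of the general memory form, the address computation can be sketched in C. The function name effective_address and its parameters are hypothetical, chosen for this sketch; they do not come from the slides. It assumes the general form Imm(Eb, Ei, s) names the address Imm + R[Eb] + R[Ei]*s, with a register that is absent from the operand passed as 0 and a missing scale passed as 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): address named by the general
   memory operand Imm(Eb, Ei, s), i.e. Imm + R[Eb] + R[Ei] * s.
   Pass 0 for a register that the operand omits, and 1 for a missing scale. */
uint32_t effective_address(uint32_t imm, uint32_t base, uint32_t index,
                           uint32_t scale)
{
    /* The scaling factor must be 1, 2, 4, or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return imm + base + index * scale;
}
```

For instance, with %eax = 0x100 and %edx = 0x3, the operand 9(%eax, %edx) corresponds to effective_address(9, 0x100, 0x3, 1), which is 0x10C; the instruction then reads or writes the value stored at that address.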
27

Practice Problem 1
• Assume the following values are stored at the indicated memory addresses and registers

    Address   Value       Register   Value
    ---------------------------------------
    0x100     0xFF        %eax       0x100
    0x104     0xAB        %ecx       0x1
    0x108     0x13        %edx       0x3
    0x10C     0x11

  Fill in the following table:

    Operand              Value
    -------------------------------
    %eax                 ________
    0x104                ________
    $0x108               ________
    (%eax)               ________
    4(%eax)              ________
    9(%eax, %edx)        ________
    260(%ecx, %edx)      ________
    0xFC(, %ecx, 4)      ________
    (%eax, %edx, 4)      ________
28

Practice Problem 1 - Solution

    Operand              Value
    -------------------------------
    %eax                 0x100
    0x104                0xAB
    $0x108               0x108
    (%eax)               0xFF
    4(%eax)              0xAB
    9(%eax, %edx)        0x11
    260(%ecx, %edx)      0x13
    0xFC(, %ecx, 4)      0xFF
    (%eax, %edx, 4)      0x11
29

Data Movement Instructions
• MOV classes
  – movb, movw, movl
  – Operate on data sizes of 1, 2, and 4 bytes, respectively
  – movs, movz classes
    • movsbw, movsbl, movswl – sign-extended
    • movzbw, movzbl, movzwl – zero-extended
30

Data Movement Instructions

    Instruction     Effect                      Description
    -----------------------------------------------------------------------------
    MOV   S, D      D ← S                       Move
      movb                                      Move byte
      movw                                      Move word
      movl                                      Move double word
    MOVS  S, D      D ← SignExtend(S)           Move with sign extension
      movsbw                                    Move sign-extended byte to word
      movsbl                                    Move sign-extended byte to double word
      movswl                                    Move sign-extended word to double word
    MOVZ  S, D      D ← ZeroExtend(S)           Move with zero extension
      movzbw                                    Move zero-extended byte to word
      movzbl                                    Move zero-extended byte to double word
      movzwl                                    Move zero-extended word to double word
    pushl S         R[%esp] ← R[%esp] - 4;      Push double word
                    M[R[%esp]] ← S
    popl  D         D ← M[R[%esp]];             Pop double word
                    R[%esp] ← R[%esp] + 4

Practice Problem 2
• Assume initially that %dh = 0xCD, %eax = 0x98765432
    1. movb   %dh, %al      %eax = ?
    2. movsbl %dh, %eax     %eax = ?
    3. movzbl %dh, %eax     %eax = ?
32

Practice Problem 2 - Solution
• Assume initially that %dh = 0xCD, %eax = 0x98765432
    1. movb   %dh, %al      %eax = 0x987654CD
    2. movsbl %dh, %eax     %eax = 0xFFFFFFCD
    3. movzbl %dh, %eax     %eax = 0x000000CD
33

Practice Problem 3
• What's wrong with each line?
    1. movb $0xF, (%bl)
    2. movl %ax, (%esp)
    3. movw (%eax), 4(%esp)
    4. movb %ah, %sh
    5. movl %eax, $0x123
    6. movl %eax, %dx
    7. movb %si, 8(%ebp)
34

Practice Problem 3 - Solution
• What's wrong with each line?
    1. movb $0xF, (%bl)        Ans: cannot use %bl as an address register
    2. movl %ax, (%esp)        Ans: mismatch between instruction suffix and register ID
    3. movw (%eax), 4(%esp)    Ans: cannot have both source and destination be memory references
    4. movb %ah, %sh           Ans: no register named %sh
    5. movl %eax, $0x123       Ans: cannot have an immediate as destination
    6. movl %eax, %dx          Ans: destination operand has incorrect size
    7. movb %si, 8(%ebp)       Ans: mismatch between instruction suffix and register ID
35

Example 1
    int simple(int x)
    {
        return x + 17;
    }
Compiles to:
    simple:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax       // x into %eax
        addl    $17, %eax           // x+17 into %eax
        popl    %ebp
        ret
36

Example 2
    int array(int* s, int i)
    {
        return s[i];
    }
Compiles to:
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    8(%ebp), %edx
        movl    (%edx,%eax,4), %eax
        popl    %ebp
        ret
Question: if we changed this to an array of short, could we just change the 4 to 2?
37

Example 2
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax          // i into %eax
        movl    8(%ebp), %edx           // s into %edx
        movl    (%edx,%eax,4), %eax     // M[s+4*i] -> %eax
        popl    %ebp
        ret
Question: if we changed this to an array of short, could we just change the 4 to 2?
38

Example 3
    short array(short* s, int i)
    {
        return s[i];
    }
Compiles to:
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    8(%ebp), %edx
        movzwl  (%edx,%eax,2), %eax
        popl    %ebp
        ret
Questions:
    1) What does the movzwl do?
    2) What value would be returned in %eax if the array contained -1?
39

Example 3
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax          // i into %eax
        movl    8(%ebp), %edx           // s into %edx
        movzwl  (%edx,%eax,2), %eax     // M[s+i*2] -> %eax
        popl    %ebp
        ret
Questions:
    1) What does the movzwl do?
    2) What value would be returned in %eax if the array contained -1?
40

3.5 Arithmetic and Logical Operations

    Instruction     Effect           Description
    ----------------------------------------------------------
    leal  S, D      D ← &S           Load effective address
    INC   D         D ← D + 1        Increment
    DEC   D         D ← D - 1        Decrement
    NEG   D         D ← -D           Negate
    NOT   D         D ← ~D           Complement
    ADD   S, D      D ← D + S        Add
    SUB   S, D      D ← D - S        Subtract
    IMUL  S, D      D ← D * S        Multiply
    XOR   S, D      D ← D ^ S        Exclusive-or
    OR    S, D      D ← D | S        Or
    AND   S, D      D ← D & S        And
    SAL   k, D      D ← D << k       Left shift
    SHL   k, D      D ← D << k       Left shift (same as SAL)
    SAR   k, D      D ← D >>A k      Arithmetic right shift
    SHR   k, D      D ← D >>L k      Logical right shift

Figure 3.7 Integer arithmetic operations. The load effective address (leal) instruction is commonly used to perform simple arithmetic. The remaining ones are more standard unary or binary operations. We use the notation >>A and >>L to denote arithmetic and logical right shift, respectively.
41

Section 3.5.2 Unary and Binary Operations
• Unary operations: inc, dec, neg, not
• Binary operations:
  – Operate on source and destination, storing the result in the destination
  – add, sub, imul
  – xor, or, and
    • Bitwise operations
42

Practice Problem 4
Suppose register %eax holds value x and %ecx holds value y.
Fill in the table below with formulas indicating the value that will be stored in register %edx for each of the given assembly code instructions:

    Instruction                      Result
    ---------------------------------------------
    leal 6(%eax), %edx               ____________
    leal (%eax, %ecx), %edx          ____________
    leal 7(%eax, %eax, 8), %edx      ____________
    leal 0xA(, %ecx, 4), %edx        ____________
    leal 9(%eax, %ecx, 2), %edx      ____________
43

Practice Problem 4 - Solution

    Instruction                      Result
    ---------------------------------------------
    leal 6(%eax), %edx               6 + x
    leal (%eax, %ecx), %edx          x + y
    leal 7(%eax, %eax, 8), %edx      7 + x + 8x = 7 + 9x
    leal 0xA(, %ecx, 4), %edx        10 + 4y
    leal 9(%eax, %ecx, 2), %edx      9 + x + 2y
44

Practice Problem 5
• Assume the following values are stored at the indicated memory addresses and registers

    Address   Value       Register   Value
    ---------------------------------------
    0x100     0xFF        %eax       0x100
    0x104     0xAB        %ecx       0x1
    0x108     0x13        %edx       0x3
    0x10C     0x11

  Fill in the following table:

    Instruction                    Destination   Value
    -----------------------------------------------------
    addl %ecx, (%eax)              ________      ________
    subl %edx, 4(%eax)             ________      ________
    imul $16, (%eax, %edx, 4)      ________      ________
    incl 8(%eax)                   ________      ________
    decl %ecx                      ________      ________
    subl %edx, %eax                ________      ________
45

Practice Problem 5 - Solution

    Instruction                    Destination   Value
    -----------------------------------------------------
    addl %ecx, (%eax)              0x100         0x100
    subl %edx, 4(%eax)             0x104         0xA8
    imul $16, (%eax, %edx, 4)      0x10C         0x110
    incl 8(%eax)                   0x108         0x14
    decl %ecx                      %ecx          0x0
    subl %edx, %eax                %eax          0xFD
46

Section 3.5.3: Shift Operations
• D = [x_{n-1}, x_{n-2}, …, x_0]
• Left Shift
  – SAL and SHL are the same
  – D << k = [x_{n-k-1}, x_{n-k-2}, …, x_0, 0, …, 0]
    • Drops the k most significant bits
• Right Shift
  – SAR: arithmetic right shift
    • D >>A k = [x_{n-1}, …, x_{n-1}, x_{n-1}, x_{n-2}, …, x_k]
  – SHR: logical right shift
    • D >>L k = [0, …, 0, x_{n-1}, x_{n-2}, …, x_k]
• Shift Amounts
  – k is encoded as a single byte, since only shift amounts between 0 and 31 are possible (only the low-order 5 bits of the shift amount are considered)
  – The shift amount is given either as an immediate or in the single-byte register %cl
47

Practice Problem 6
Suppose we want to generate assembly code for the following C function:
    int shift_left2_rightn(int x, int n)
    {
        x <<= 2;
        x >>= n;
        return x;
    }
The code that follows is a portion of the assembly code that performs the actual shifts and leaves the final value in register %eax. Two key instructions have been omitted. Parameters x and n are stored at memory locations with offsets 8 and 12, respectively, relative to the address in register %ebp.
    1. movl 8(%ebp), %eax        // get x
    2. _____________________     // x <<= 2
    3. movl 12(%ebp), %ecx       // get n
    4. _____________________     // x >>= n
48

Practice Problem 6 - Solution
    1. movl 8(%ebp), %eax        // get x
    2. sall $2, %eax             // x <<= 2
    3. movl 12(%ebp), %ecx       // get n
    4. sarl %cl, %eax            // x >>= n
49

Example 4
    void array_set(int* s, int i, int value)
    {
        s[i] = value;
    }
Compiles to:
    array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx          // add comments
        movl    12(%ebp), %edx          //
        movl    8(%ebp), %eax           //
        movl    %ecx, (%eax,%edx,4)     //
        popl    %ebp
        ret
50

Example 4
    array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx          // value into %ecx
        movl    12(%ebp), %edx          // i into %edx
        movl    8(%ebp), %eax           // s into %eax
        movl    %ecx, (%eax,%edx,4)     // value into memory at (s + 4*i)
        popl    %ebp
        ret
51

Example 5
Example 5: Example 4 using short
    void array_set(short* s, short i, short value)
    {
        s[i] = value;
    }
Compiles to:
    array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx          // add comments here
        movl    12(%ebp), %edx          //
        movl    8(%ebp), %eax           //
        movw    %cx, (%eax,%edx,2)      //
        popl    %ebp
        ret
Note: the use of movw and cx instead of movl and ecx
Note: 4 bytes are used to store value on the stack, even though only 2 are needed.
52

Example 5
    array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx          // value into %ecx
        movl    12(%ebp), %edx          // i into %edx
        movl    8(%ebp), %eax           // s into %eax
        movw    %cx, (%eax,%edx,2)      // value into memory at (s + 2*i)
        popl    %ebp
        ret
Note: the use of movw and cx instead of movl and ecx
Note: 4 bytes are used to store value on the stack, even though only 2 are needed.
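The effect of movw %cx, (%eax,%edx,2) can be mimicked in C as a sketch. The helper name store_low16 is hypothetical, and the array is declared uint16_t (rather than short) only so that the truncation is fully defined in C. The 32-bit argument plays the role of %ecx, and only its low 16 bits (%cx) reach memory.

```c
#include <stdint.h>

/* Hypothetical model of: movw %cx, (%eax,%edx,2)
   s plays the role of %eax, i of %edx, and ecx of %ecx. */
void store_low16(uint16_t *s, int32_t i, uint32_t ecx)
{
    /* Keep only the low 16 bits (%cx); the upper 16 bits of %ecx are ignored.
       Indexing a 2-byte element type gives the address s + 2*i. */
    s[i] = (uint16_t)(ecx & 0xFFFFu);
}
```

Calling store_low16(a, 1, 0x12345678) on a two-element array leaves a[0] untouched and writes 0x5678 into a[1], which is the sense in which the stack slot holds 4 bytes although only 2 are needed.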
53

Example 6
Example 6: using long long
    long long array(long long* s, int i)
    {
        return s[i];
    }
Compiles to:
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %edx          // add comments here
        movl    8(%ebp), %eax           //
        leal    (%eax,%edx,8), %edx     //
        movl    (%edx), %eax            //
        movl    4(%edx), %edx           //
        popl    %ebp
        ret
54

Example 6
    array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %edx          // i into %edx
        movl    8(%ebp), %eax           // s into %eax
        leal    (%eax,%edx,8), %edx     // address of s[i] into %edx
        movl    (%edx), %eax            // low 32 bits of s[i] into %eax
        movl    4(%edx), %edx           // high 32 bits of s[i] into %edx
        popl    %ebp
        ret                             // 64-bit return value in %edx:%eax
55

Example 7: Using Pointer Parameter
    void exchange(int *xp, int *yp)
    {
        int temp;
        temp = *xp;
        *xp = *yp;
        *yp = temp;
    }
Compiles to:
    exchange:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movl    8(%ebp), %edx           // add comments
        movl    12(%ebp), %ecx          //
        movl    (%edx), %ebx            //
        movl    (%ecx), %eax            //
        movl    %eax, (%edx)            //
        movl    %ebx, (%ecx)            //
        popl    %ebx
        popl    %ebp
        ret
Question: It takes 4 movl instructions to do the exchange, but the C source code does it in 3 assignments. Why?
56

Example 7: Using Pointer Parameter
    exchange:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movl    8(%ebp), %edx           // %edx = xp
        movl    12(%ebp), %ecx          // %ecx = yp
        movl    (%edx), %ebx            // %ebx = *xp
        movl    (%ecx), %eax            // %eax = *yp
        movl    %eax, (%edx)            // *xp = *yp
        movl    %ebx, (%ecx)            // *yp = old *xp (temp)
        popl    %ebx
        popl    %ebp
        ret
Question: It takes 4 movl instructions to do the exchange, but the C source code does it in 3 assignments. Why?
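One way to see where the fourth move comes from: an IA32 mov cannot take memory as both source and destination (see Practice Problem 3), so a C assignment from one memory location to another needs a load into a register followed by a store. The sketch below rewrites exchange with one C statement per movl. The function name exchange_steps and the local variable names, chosen to echo the registers, are illustrative only.

```c
#include <assert.h>

/* Illustrative rewrite of exchange(): each statement corresponds to one
   of the four movl instructions that touch *xp or *yp. */
void exchange_steps(int *xp, int *yp)
{
    assert(xp && yp);   /* sketch-only sanity check, not in the original */

    int ebx = *xp;      /* movl (%edx), %ebx : temp = *xp   (load)  */
    int eax = *yp;      /* movl (%ecx), %eax : read *yp     (load)  */
    *xp = eax;          /* movl %eax, (%edx) : *xp = *yp    (store) */
    *yp = ebx;          /* movl %ebx, (%ecx) : *yp = temp   (store) */
}
```

Under this reading, temp = *xp is a single load because temp lives in a register, *yp = temp is a single store, and only the middle assignment *xp = *yp has to be split into a load and a store.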
57

Example 8: Arithmetic and Logical Operations
    int arith(int x, int y, int z)
    {
        int t1 = x + y;
        int t2 = z + t1;
        int t3 = x + 4;
        int t4 = y * 48;
        int t5 = t3 + t4;
        int rval = t2 * t5;
        return rval;
    }
Compiles to:
    arith:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx           // add comments here
        movl    12(%ebp), %edx          //
        leal    (%edx,%edx,2), %eax     //
        sall    $4, %eax                //
        leal    4(%ecx,%eax), %eax      //
        addl    %ecx, %edx              //
        addl    16(%ebp), %edx          //
        imull   %edx, %eax              //
        popl    %ebp
        ret
58

Example 8: Arithmetic and Logical Operations
    arith:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx           // %ecx = x
        movl    12(%ebp), %edx          // %edx = y
        leal    (%edx,%edx,2), %eax     // %eax = y + 2*y = 3y
        sall    $4, %eax                // %eax = 16*3y = 48y = t4
        leal    4(%ecx,%eax), %eax      // %eax = 4 + x + 48y = t3 + t4 = t5
        addl    %ecx, %edx              // %edx = x + y = t1
        addl    16(%ebp), %edx          // %edx = z + t1 = t2
        imull   %edx, %eax              // %eax = t2 * t5
        popl    %ebp
        ret
59

3.5.4 Discussions
• What does the following instruction do?
    xorl %eax, %eax
60

3.5.4 Discussions
• What does the following instruction do?
    xorl %eax, %eax
  Ans: It sets %eax to zero
61

Section 3.5 Special Arithmetic Operations
• 5 special operations
  – imull, mull, cltd, idivl, divl

    Instruction   Effect                                     Description
    ------------------------------------------------------------------------------
    imull S       R[%edx]:R[%eax] ← S × R[%eax]              Signed full multiply
    mull  S       R[%edx]:R[%eax] ← S × R[%eax]              Unsigned full multiply
    cltd          R[%edx]:R[%eax] ← SignExtend(R[%eax])      Convert to quad word
    idivl S       R[%edx] ← R[%edx]:R[%eax] mod S;           Signed divide
                  R[%eax] ← R[%edx]:R[%eax] ÷ S                (remainder in %edx, quotient in %eax)
    divl  S       R[%edx] ← R[%edx]:R[%eax] mod S;           Unsigned divide
                  R[%eax] ← R[%edx]:R[%eax] ÷ S                (remainder in %edx, quotient in %eax)
62

imull and mull
• Take one operand: imull S
• Multiply the operand by %eax
• The resulting 64 bits are put in %edx (high bits) and %eax (low bits)
  – Compare with the two-operand form imull S, D
    • That form throws away the high-order bits that do not fit in the destination
• imull is for signed and mull is for unsigned
63

Example
Suppose we have signed numbers x and y stored at positions 8 and 12 relative to %ebp, and we want to store their full 64-bit product as 8 bytes on top of the stack.
    1. movl  12(%ebp), %eax      // y into %eax
    2. imull 8(%ebp)             // x * y in R[%edx]:R[%eax]
    3. movl  %eax, (%esp)        // store low 32 bits
    4. movl  %edx, 4(%esp)       // store high 32 bits
Note: Assume a little-endian machine
64

idivl and divl
• Take one operand: idivl S
• Divide the 64-bit R[%edx]:R[%eax] by the operand
• The low 32 bits of the quotient are put in %eax
• The remainder is put into %edx
65

Example 9
• Suppose we have signed numbers x and y stored at positions 8 and 12 relative to %ebp
    1. movl  8(%ebp), %eax
    2. cltd
    3. idivl 12(%ebp)
    4. movl  %eax, 4(%esp)
    5. movl  %edx, (%esp)
66

Example 9
• Suppose we have signed numbers x and y stored at positions 8 and 12 relative to %ebp
    1. movl  8(%ebp), %eax       // x into %eax
    2. cltd                      // sign-extend %eax into %edx
    3. idivl 12(%ebp)            // divide x by y
    4. movl  %eax, 4(%esp)       // x / y
    5. movl  %edx, (%esp)        // x % y = x mod y
67

Section 3.6 Control
• Section 3.6.1 Condition Codes
IA32 uses four single-bit flags called condition codes, which are set by certain instructions based on the result of the instruction.

    Flag   Name            Use
    ---------------------------------------------------------------------------
    CF     Carry flag      Carry out of the most significant bit. Used to detect
                           overflow for unsigned operations.
    ZF     Zero flag       The most recent operation yielded zero.
    SF     Sign flag       The most recent operation yielded a negative value.
    OF     Overflow flag   The most recent operation caused a two's-complement
                           overflow, either positive or negative (signed overflow).
68

Condition Codes
• OF flag: result of add or sub has the wrong sign
  – addl sets OF if both operands have the same sign, but the result has a different sign.
  – subl A, B calculates B-A and sets OF if B>0, A<0, and B-A<0, or if B<0, A>0, and B-A>0
69

Condition Codes
• CF flag: carry out of the high bit
  – addl sets CF if the unsigned result does not fit, e.g., as an 8-bit unsigned operation: 127 + 131 = 2
  – For shift operations
    • CF is set to the last bit shifted out.
    • sal and shl set the carry bit to the former MSB (most significant bit)
    • sar and shr set the carry bit to the former LSB (least significant bit)
70

Condition Codes
• The following instructions set the condition codes appropriately:
    inc, dec, neg, not, add, sub, mul, imul, div, idiv, xor, or, and, sal, shl, sar, shr
• The following instructions do not modify the condition codes:
    mov, leal, push, pop, call, ret, cltd
71

Comparison and Test Instructions

    Instruction      Based on     Description
    ------------------------------------------------
    CMP  S2, S1      S1 - S2      Compare
      cmpb                        Compare byte
      cmpw                        Compare word
      cmpl                        Compare double word
    TEST S2, S1      S1 & S2      Test
      testb                       Test byte
      testw                       Test word
      testl                       Test double word

These do not store the result of the computation in the destination; only the condition codes are set.
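The behavior of cmpl and testl can be sketched in C. This is an illustrative model, not code from the slides: the type flags_t and the functions cmpl_flags and testl_flags are hypothetical names, and 32-bit operands are assumed. Both functions compute a result only to derive the flags and then discard it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition codes. */
typedef struct {
    bool cf, zf, sf, of;
} flags_t;

/* Model of `cmpl s2, s1`: flags are based on s1 - s2. */
flags_t cmpl_flags(uint32_t s2, uint32_t s1)
{
    uint32_t r = s1 - s2;       /* result is used for flags only */
    flags_t f;
    f.cf = s1 < s2;             /* unsigned borrow */
    f.zf = (r == 0);
    f.sf = (r >> 31) & 1;       /* sign bit of the result */
    /* Signed overflow: s1 and s2 differ in sign, and the result's
       sign differs from the sign of s1. */
    f.of = (((s1 ^ s2) & (s1 ^ r)) >> 31) & 1;
    return f;
}

/* Model of `testl s2, s1`: flags are based on s1 & s2; CF and OF are cleared. */
flags_t testl_flags(uint32_t s2, uint32_t s1)
{
    uint32_t r = s1 & s2;
    flags_t f;
    f.cf = false;
    f.zf = (r == 0);
    f.sf = (r >> 31) & 1;
    f.of = false;
    return f;
}
```

For example, cmpl_flags(20, 10) models comparing 10 with 20: the subtraction 10 - 20 borrows, so CF and SF are set while ZF and OF are clear. A common idiom modeled by testl_flags(x, x) checks whether x is zero, negative, or positive without changing x.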
72

Example
• Suppose we used one of the ADD instructions to perform the equivalent of the C assignment t = a + b, where variables a, b, and t are integers.

    CF:  (unsigned) t < (unsigned) a               Unsigned overflow
    ZF:  (t == 0)                                  Zero
    SF:  (t < 0)                                   Negative
    OF:  (a < 0 == b < 0) && (t < 0 != a < 0)      Signed overflow
73

3.6.2 Accessing the Condition Codes
• You can set a byte to 0 or 1 depending on the condition flags with the set instructions.
• These take a single-byte operand as the destination: either an 8-bit register or a single byte of memory.
74

The SET instructions

    Instruction   Synonym   Effect                    Description
    ------------------------------------------------------------------------------
    sete  D       setz      D ← ZF                    Equal or zero
    setne D       setnz     D ← ~ZF                   Not equal or not zero
    sets  D                 D ← SF                    Negative
    setns D                 D ← ~SF                   Nonnegative
    (signed)
    setg  D       setnle    D ← ~(SF ^ OF) & ~ZF      Greater (signed >)
    setge D       setnl     D ← ~(SF ^ OF)            Greater or equal (signed >=)
    setl  D       setnge    D ← SF ^ OF               Less (signed <)
    setle D       setng     D ← (SF ^ OF) | ZF        Less or equal (signed <=)
    (unsigned)
    seta  D       setnbe    D ← ~CF & ~ZF             Above (unsigned >)
    setae D       setnb     D ← ~CF                   Above or equal (unsigned >=)
    setb  D       setnae    D ← CF                    Below (unsigned <)
    setbe D       setna     D ← CF | ZF               Below or equal (unsigned <=)

• The important part of this table is the effect field, which shows how the 4 condition codes are related to various tests.
• The description field is based on a previous instruction of the form cmp S2, S1
  – negative refers to the value of S1-S2
  – greater, less, above, or below refer to comparing S1 to S2
75

Unsigned Comparison
• In interpreting the effect and description, consider the instruction:
    cmpl S2, S1
  which calculates S1-S2.
  If S1 and S2 are unsigned, S1 is above S2 if the result of S1-S2 is not zero and does not set the carry flag.
    D ← ~CF & ~ZF
• The other three comparisons can be understood from this one using De Morgan's laws.
76

Signed Comparison
• The signed comparisons are a bit more complicated
• Consider the greater-than-or-equal test condition.
  o Under what conditions is S1 >= S2?
  o Answer: S1 – S2 >= 0
  o This would indicate that we just want SF = 0
  o But recall that sometimes the SF is incorrect.
  o This is indicated by the OF flag.
  o So if SF is correct (OF=0), we just test SF=0, or ~SF.
  o If SF is incorrect (OF=1), we want SF=1.
  o This is ~(SF ^ OF)
  o ^ is exclusive or
• The other signed comparisons can be derived from >= using De Morgan's laws.
77

Example
• Consider the following code segment:
    cmpl $10, $20
    jle .L1
  Does this jump?
78

Example
• Consider the following code segment:
    cmpl $10, $20
    jle .L1
  Does this jump?
  Ans: No. The comparison is based on 20 - 10 = 10, so "less or equal" (20 <= 10) is false.
79

Section 3.6.3 Jump Instructions and Their Encoding
• Jump instructions change the flow of control so that the next instruction executed is not the one that follows in sequence.
• Traditional instruction cycle, also called the fetch-and-execute cycle or fetch-decode-execute cycle.
• The program counter (PC) register contains the address of the next instruction to execute.
  – Fetch: read the instruction whose address is in the PC
  – Increment PC: increment PC so that it points to the next instruction.
  – Decode: determine what instruction this is.
  – Execute: do what the instruction indicates.
  – Store: store the result.
80

Section 3.6.3 Jump Instructions and Their Encoding

    Instruction      Synonym   Jump condition         Description
    ------------------------------------------------------------------------------
    (unconditional)
    jmp Label                  1                      Direct jump
    jmp *Operand               1                      Indirect jump
    (conditional)
    je  Label        jz        ZF                     Equal / zero
    jne Label        jnz       ~ZF                    Not equal / not zero
    js  Label                  SF                     Negative
    jns Label                  ~SF                    Nonnegative
    jg  Label        jnle      ~(SF ^ OF) & ~ZF       Greater (signed >)
    jge Label        jnl       ~(SF ^ OF)             Greater or equal (signed >=)
    jl  Label        jnge      SF ^ OF                Less (signed <)
    jle Label        jng       (SF ^ OF) | ZF         Less or equal (signed <=)
    ja  Label        jnbe      ~CF & ~ZF              Above (unsigned >)
    jae Label        jnb       ~CF                    Above or equal (unsigned >=)
    jb  Label        jnae      CF                     Below (unsigned <)
    jbe Label        jna       CF | ZF                Below or equal (unsigned <=)

Figure 3.12. The jump instructions. These instructions jump to a labeled destination when the jump condition holds.
Some instructions have "synonyms", alternate names for the same machine instruction.
81 Unconditional jump Instruction
      movl $0, %eax        // set %eax to 0
      jmp .L1              // goto .L1
      movl (%eax), %edx    // will be skipped
    .L1:
      popl %edx
• IA32 unconditional jump instructions come in two types, direct and indirect:
      jmp Label            // direct
      jmp *Operand         // indirect
      jmp *%eax            // use the value in register %eax as the jump target
      jmp *(%eax)          // use the value in register %eax as the read address
• Unconditional jumps are rarely used, except with conditional jumps.
82-83 Conditional jump Example
An example: jump.c
    int simple_jump(int x, int y, int z)
    {
        if (x == 0)
            return y-z;
        return z-y;
    }
After cc -O1 -S jump.c, jump.s contains
    simple_jump:
      pushl %ebp
      movl %esp, %ebp
      cmpl $0, 8(%ebp)     // compare x to 0
      jne .L2              // jump if x != 0
      movl 12(%ebp), %eax  // y into %eax
      subl 16(%ebp), %eax  // y-z into %eax
      jmp .L3              // done
    .L2:                   // this is the case x != 0
      movl 16(%ebp), %eax  // get z into %eax
      subl 12(%ebp), %eax  // z-y into %eax
    .L3:                   // common return
      popl %ebp
      ret
84 Jump instruction encoding
• There are several ways that jump instructions are encoded, the simplest of which is with a PC-relative destination.
• After cc -c -O1 jump.c and objdump -d jump.o, we get
    00000000 <simple_jump>:
       0:  55             push %ebp
       1:  89 e5          mov  %esp,%ebp
       3:  83 7d 08 00    cmpl $0x0,0x8(%ebp)
       7:  75 08          jne  11
       9:  8b 45 0c       mov  0xc(%ebp),%eax
       c:  2b 45 10       sub  0x10(%ebp),%eax
       f:  eb 06          jmp  17
      11:  8b 45 10       mov  0x10(%ebp),%eax
      14:  2b 45 0c       sub  0xc(%ebp),%eax
      17:  5d             pop  %ebp
      18:  c3             ret
• Labels have been replaced by the address relative to the start of the program.
• During the execution phase of the jne instruction at 7, the PC has the value 9 (it points to the next instruction).
• The encoding of jne shows a jump offset of 8: 0x9 + 0x8 = 0x11 (17 decimal).
• During the execution of the jmp instruction at f, the PC has the value 0x11.
• The jump offset is 6, giving 0x11 + 0x6 = 0x17.
85 Practice Problem 7
In the following excerpts from a disassembled binary, some of the information has been replaced by Xs. Answer the following questions about these instructions.
A. What is the target of the je instruction below? (You don't need to know anything about the call instruction here.)
    804828f:  74 05                   je   XXXXXXX
    8048291:  e8 1e 00 00 00          call 80482b4
B. What is the target of the jb instruction below?
    8048357:  72 e7                   jb   XXXXXXX
    8048359:  c6 05 10 a0 04 08 01    movb $0x1,0x804a010
C. What is the address of the mov instruction?
    XXXXXXX:  74 12                   je   8048391
    XXXXXXX:  b8 00 00 00 00          mov  $0x0,%eax
86 Practice Problem 7 Solution
A. Ans: PC = 0x8048291; Offset = 0x05;
   Dest = PC + Offset = 0x8048291 + 0x05 = 0x8048296
B. Ans: PC = 0x8048359;
   Offset = 0xe7 = 1110 0111 in binary, a negative one-byte two's-complement value; its magnitude is 0001 1001 = 25 = 0x19, so Offset = -0x19;
   Dest = PC + Offset = 0x8048359 - 0x19 = 0x8048340
C. Ans: Dest = PC + Offset, so PC = Dest - Offset = 0x8048391 - 0x12 = 0x804837f. This is the address of the mov instruction.
   Address of the je instruction = 0x804837f - 0x2 = 0x804837d
87 Practice Problem 8
D. In the code that follows, the jump target is encoded in PC-relative form as a 4-byte, two's-complement number.
The bytes are listed from least significant to most, reflecting the little-endian byte ordering of IA32. What is the address of the jump target?
    80482bf:  e9 e0 ff ff ff       jmp  XXXXXXX
    80482c4:  90                   nop
E. Explain the relation between the annotation on the right and the byte coding on the left.
    80482aa:  ff 25 fc 9f 04 08    jmp  *0x8049ffc
88 Practice Problem 8 Solution
D. Ans: Offset = 0xffffffe0 = -32 = -0x20; PC = 0x80482c4;
   Dest = PC + Offset = 0x80482c4 - 0x20 = 0x80482a4
E. Ans: An indirect jump is denoted by instruction code ff 25. The address from which the jump target is to be read is encoded explicitly by the following 4 bytes. Since the machine is little endian, these are given in reverse order as fc 9f 04 08.
89 Section 3.6.5 Loops
• C provides several looping constructs, namely do-while, while, and for.
• No corresponding instructions exist in machine code. Instead, combinations of conditional tests and jumps are used to implement the effect of loops.
90 Section 3.6.5 Loops
Do-while loops
    do
        body-statement
    while (test-expr);
• The effect of the loop is to repeatedly execute body-statement, evaluate test-expr, and continue the loop if the evaluation result is nonzero.
• The body-statement is executed at least once.
While loops
    while (test-expr)
        body-statement
• It differs from do-while in that test-expr is evaluated and the loop is potentially terminated before the first execution of body-statement.
For loops (general form)
    for (init-expr; test-expr; update-expr)
        body-statement
Identical to the following code:
For loops (equivalent while form)
    init-expr;
    while (test-expr) {
        body-statement
        update-expr;
    }
91-92 Example 1: A do-while loop
    int fact_do(int n)
    {
        int result = 1;
        do {
            result *= n;
            n--;
        } while (n > 1);
        return result;
    }
And the corresponding assembly code:
    fact_do:
      pushl %ebp
      movl %esp, %ebp
      movl 8(%ebp), %edx   // n into %edx
      movl $1, %eax        // result is in %eax, initial value = 1
    .L2:
      imull %edx, %eax     // result = result * n
      subl $1, %edx        // n--
      cmpl $1, %edx        // compare n to 1
      jg .L2               // jump if n > 1
      popl %ebp
      ret
93-94 Example 2: A while loop
C code:
    int fact_while(int n)
    {
        int result = 1;
        while (n > 1) {
            result *= n;
            n--;
        }
        return result;
    }
The corresponding assembly code:
    fact_while:
      pushl %ebp
      movl %esp, %ebp
      movl 8(%ebp), %edx   // get n into %edx
      movl $1, %eax        // result in %eax
      cmpl $1, %edx        // see if n > 1
      jle .L3              // no, we are done
    .L6:
      imull %edx, %eax     // yes, result = result * n
      subl $1, %edx        // n--
      cmpl $1, %edx        // compare again
      jg .L6               // keep going if n > 1
    .L3:
      popl %ebp
      ret
95-96 Example 3: A for loop C
code:
    int fact_for(int n)
    {
        int i;
        int result = 1;
        for (i=2; i <= n; i++)
            result *= i;
        return result;
    }
The corresponding assembly code:
    fact_for:
      pushl %ebp
      movl %esp, %ebp
      movl 8(%ebp), %ecx   // n into %ecx
      movl $2, %edx        // 2 into %edx (this is i)
      movl $1, %eax        // 1 into %eax (the result)
      cmpl $1, %ecx        // compare n to 1
      jle .L3              // done if n <= 1 (continue if n >= 2)
    .L6:
      imull %edx, %eax     // result = result * i
      addl $1, %edx        // i++
      cmpl %edx, %ecx      // compare n to i
      jge .L6              // continue if n >= i (done if n < i)
    .L3:
      popl %ebp
      ret
97
• We will skip Sections 3.6.6 and 3.6.7.
98 Section 3.7 Procedure
• A procedure involves:
    – Passing data in the form of procedure parameters and return values
    – Passing control from one part of the program to another
    – Allocating space for the local variables of the procedure on entry and deallocating it on exit
99 Section 3.7.1 Stack Frame Structure
• Procedure P calls procedure Q
    – P: caller
    – Q: callee
• The stack is used for passing parameters, for local variables, and for storing other values.
• The stack is organized into pieces called stack frames.
• A stack frame has 2 pointers
    – %ebp, the frame pointer
        • Most information is accessed relative to the frame pointer
    – %esp, the stack pointer
        • Can move
100 Section 3.7.1 Stack Frame Structure
• The %ebp frame pointer points to the saved %ebp register on the stack
• Usually %ebp does not change
• The first parameter is at 8(%ebp) because of the return address and saved %ebp.
• %ebp is used to address data on the caller's stack (such as parameters)
• In our examples, %esp usually did not change during execution, but in general it will when space on the stack is allocated for
    – Saved registers
    – Local variables
    – Parameters of procedures that will be called, e.g., when Q calls R
101 Section 3.7.2 Transferring Control
Three instructions are used for supporting procedures:
    Instruction      Description
    call Label       Procedure call (direct)
    call *Operand    Procedure call (indirect)
    leave            Prepare stack for return
    ret              Return from call
• call pushes the return address (current PC) on the stack and sets the PC to the label
    – The current PC holds the return address: the address of the next instruction, the one following the call
    – A call can be direct or indirect
• leave is equivalent to:
      movl %ebp, %esp
      popl %ebp
  The purpose of the first of these is to restore the stack pointer to the value it had after the initial push of %ebp. We haven't seen leave before because none of our procedures have needed to change %esp, so the first of these was not necessary.
• ret pops the return address and jumps to this address.
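The effect of call, leave, and ret on %eip, %esp, and %ebp can be sketched in C. This is a toy model under stated assumptions, not an emulator: the Machine struct, the sim_* function names, the 64-slot stack, and the insn_len parameter (the byte length of the call instruction) are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Toy model of IA32 control transfer. All names here are hypothetical. */
enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t eip;                 /* program counter */
    uint32_t esp;                 /* stack pointer (byte address) */
    uint32_t ebp;                 /* frame pointer (byte address) */
    uint32_t base;                /* lowest byte address covered by stack[] */
    uint32_t stack[STACK_WORDS];  /* one entry per 4-byte stack slot */
} Machine;

/* Start with an empty stack: %esp points just past the highest slot. */
void sim_init(Machine *m, uint32_t base, uint32_t eip) {
    m->base = base;
    m->eip = eip;
    m->esp = base + 4u * STACK_WORDS;
    m->ebp = m->esp;
}

/* pushl: the stack grows toward lower addresses. No bounds checking. */
void sim_push(Machine *m, uint32_t value) {
    m->esp -= 4;
    m->stack[(m->esp - m->base) / 4] = value;
}

/* popl: read the top slot, then move %esp up by 4. */
uint32_t sim_pop(Machine *m) {
    uint32_t value = m->stack[(m->esp - m->base) / 4];
    m->esp += 4;
    return value;
}

/* call: push the address of the instruction after the call, then jump. */
void sim_call(Machine *m, uint32_t target, uint32_t insn_len) {
    sim_push(m, m->eip + insn_len);
    m->eip = target;
}

/* leave: movl %ebp, %esp followed by popl %ebp. */
void sim_leave(Machine *m) {
    m->esp = m->ebp;
    m->ebp = sim_pop(m);
}

/* ret: pop the return address into the program counter. */
void sim_ret(Machine *m) {
    m->eip = sim_pop(m);
}
```

In this model, a sim_call followed by the usual prologue (push %ebp, copy %esp to %ebp, subtract from %esp), then sim_leave and sim_ret, brings %esp and %ebp back to their values before the call and leaves %eip at the instruction after the call.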
102 Practice Problem 9
• Example from Section 3.2.2, sum and main. The following are excerpts of the disassembled code for the two functions:
    Beginning of function sum:
    08048394 <sum>:
      8048394:  55                push %ebp
      ...
    Return from function sum:
      80483a4:  c3                ret
      ...
    Call to sum from main:
      80483dc:  e8 b3 ff ff ff    call 8048394 <sum>
      80483e1:  83 c4 14          add  $0x14,%esp
Trace the registers %eip (PC) and %esp:
1) Before executing call, PC (%eip) = ________; %esp = 0xff9b960
2) When executing call, the PC value (return address) is pushed onto the stack; %esp = _______; %eip = ________
3) After return from call, %eip = _________; %esp = __________
103 Practice Problem 9 Solution
1) Before executing call, PC (%eip) = 0x80483dc; %esp = 0xff9b960
2) When executing call, the PC value (return address) is pushed onto the stack; %esp = 0xff9b95c; %eip = 0x8048394
3) After return from call, %eip = 0x80483e1; %esp = 0xff9b960
104 Section 3.7.3 Register Usage Conventions
• The set of program registers acts as a single resource shared by all of the procedures.
• Only one procedure can be active at a given time.
• Caller-save registers:
    – %eax, %edx, %ecx
    – When Q is called by P, it can overwrite these registers without destroying any data required by P.
• Callee-save registers:
    – %ebx, %esi, %edi
    – Q must save the values of any of these registers on the stack before overwriting them, and restore them before returning. P might need these values for its further computation.
    – %ebp, %esp must be maintained according to the conventions described here.
105 Example
Assembly code sequence:
     1. subl $12, %esp
     2. movl %ebx, (%esp)
     3. movl %esi, 4(%esp)
     4. movl %edi, 8(%esp)
     5. movl 8(%ebp), %ebx
     6. movl 12(%ebp), %edi
     7. movl (%ebx), %esi
     8. movl (%edi), %eax
     9. movl 16(%ebp), %edx
    10. movl (%edx), %ecx
• Three registers (%ebx, %esi, %edi) are saved on the stack (Lines 2-4). The program modifies these and three other registers (%eax, %ecx, %edx).
• At the end of the procedure, the values of registers %edi, %esi, %ebx are restored (not shown), while the other three are left in their modified states.
106-107 A procedure example of Section 3.7.4
C code (swap_add.c):
    int swap_add(int *xp, int *yp)
    {
        int x = *xp;
        int y = *yp;
        *xp = y;
        *yp = x;
        return x + y;
    }
Here is the assembly code generated:
    swap_add:
      pushl %ebp
      movl %esp, %ebp
      pushl %ebx
      movl 8(%ebp), %edx     // xp into %edx
      movl 12(%ebp), %ecx    // yp into %ecx
      movl (%edx), %ebx      // x, i.e. %ebx = *xp
      movl (%ecx), %eax
      movl %eax, (%edx)
      movl %ebx, (%ecx)
      addl %ebx, %eax
      popl %ebx
      popl %ebp
      ret
Comments for the next four instructions, from movl (%ecx), %eax through addl %ebx, %eax, in order: // y, i.e.
%eax = *yp // *xp = y // *yp = x // %eax = x + y for return
108-109 A procedure example of Section 3.7.4
C code (caller.c):
    int caller()
    {
        int arg1 = 534;
        int arg2 = 1057;
        int sum = swap_add(&arg1, &arg2);
        int diff = arg1 - arg2;
        return sum*diff;
    }
Here is the assembly code generated:
    caller:
      pushl %ebp
      movl %esp, %ebp
      subl $24, %esp         // allocate 6 double words on the stack
      movl $534, -4(%ebp)    // 534 on the stack (arg1)
      movl $1057, -8(%ebp)   // 1057 on the stack (arg2)
      leal -8(%ebp), %eax    // &arg2 (address of 1057) into %eax
      movl %eax, 4(%esp)     // &arg2 on the stack
      leal -4(%ebp), %eax    // &arg1 (address of 534) into %eax
      movl %eax, (%esp)      // &arg1 on the stack
      call swap_add
    .R1:
      movl -4(%ebp), %edx    // arg1 into %edx
      subl -8(%ebp), %edx    // arg1 - arg2 into %edx
      imull %edx, %eax       // diff * return value in %eax
      leave                  // restore the stack pointer
      ret
110 A procedure example of Section 3.7.4
Practice Problem 1: Keep track of register values: %eax, %ebx, %ecx, %edx, %ebp, %esp
111 A procedure example of Section 3.7.4
Why did the compiler reserve 6 words = 24 bytes on the stack when it only needed 4 words?
Answer:
• Convention: the total number of stack bytes used by a function should be a multiple of 16, e.g., 16, 32, 48.
• This counts the 4 bytes for the return address and the 4 bytes for the saved %ebp.
• If only 4 words were reserved, the total would be 16+8 = 24 bytes.
• To get this up to 32, we need to add 8 more bytes, or 2 more words.
• This does not reduce the speed of execution.
• It does use a small amount of extra memory.
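The rounding in the answer above can be written out as a short C sketch. It mirrors only the arithmetic on this slide (8 bytes of overhead, total rounded up to a multiple of 16); it is not a statement of what any particular compiler version does, and reserved_bytes, FRAME_OVERHEAD, and FRAME_ALIGN are hypothetical names.

```c
#include <stddef.h>

/* Overhead counted by the convention: 4-byte return address + 4-byte saved %ebp. */
enum { FRAME_OVERHEAD = 8, FRAME_ALIGN = 16 };

/* Hypothetical helper: given the bytes a function needs below the saved %ebp
   (locals plus outgoing arguments), return the amount to subtract from %esp
   so that overhead + reserved space is a multiple of 16. */
size_t reserved_bytes(size_t needed) {
    size_t total = needed + FRAME_OVERHEAD;
    size_t rounded = (total + FRAME_ALIGN - 1) / FRAME_ALIGN * FRAME_ALIGN;
    return rounded - FRAME_OVERHEAD;
}
```

For caller, the 4 needed words are 16 bytes, so reserved_bytes(16) gives 24, which matches the subl $24, %esp in the generated code.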
112 A recursive procedure example of Section 3.7.5
C code:
    int rfact(int n)
    {
        int result;
        if (n < 1)
            result = 1;
        else
            result = n * rfact(n-1);
        return result;
    }
Here is the assembly code generated:
    rfact:
      pushl %ebp
      movl %esp, %ebp
      pushl %ebx
      subl $4, %esp
      movl 8(%ebp), %ebx
      movl $1, %eax
      testl %ebx, %ebx
      jle .L3
      leal -1(%ebx), %eax
      movl %eax, (%esp)
      call rfact
    .R1:
      imull %ebx, %eax
    .L3:
      addl $4, %esp
      popl %ebx
      popl %ebp
      ret
113 A procedure example of Section 3.7.5
Practice Problem 2: Keep track of register values: %eax, %ebx, %ebp, %esp
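One way to check a hand trace of the recursion is to instrument the C version so that it prints a line at each call and each return. The sketch below assumes a hypothetical helper rfact_trace with an extra depth parameter used only for indentation; it follows the same logic as rfact but is not the code the compiler was given.

```c
#include <stdio.h>

/* Hypothetical tracing variant of rfact. In the assembly version, the value
   printed at "call" is what %ebx holds in that frame (n), and the value
   printed at "return" is what %eax holds when ret executes. */
int rfact_trace(int n, int depth) {
    printf("%*scall   rfact(%d)\n", depth * 2, "", n);
    int result;
    if (n < 1)
        result = 1;
    else
        result = n * rfact_trace(n - 1, depth + 1);
    printf("%*sreturn %d\n", depth * 2, "", result);
    return result;
}
```

Calling rfact_trace(3, 0) prints four nested call lines, for n = 3, 2, 1, 0, followed by the returns in reverse order, which is the order in which the stack frames are created and released.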
