PPT - Department of Computer Science

advertisement
CS 3843 Computer Organization
Prof. Qi Tian
Fall 2013
http://www.cs.utsa.edu/~qitian/CS3843/
1
Chapter 3 Machine-Level Representations of
Programs
• 11/11/2013 (Monday)
– Section 3.7.5 Procedure
– Quiz 4
2
Chapter 3 Machine-Level Representations of
Programs
• 11/08/2013 (Friday)
– Tracking a recursive procedure Section 3.7.5
– Solution is posted under Resources.
• 11/06/2013 (Wednesday)
– Tracking a procedure Section 3.7.4
– Tracking a recursive procedure Section 3.7.5
– Reminder: Quiz on Friday Nov. 8
• 11/04/2013 (Monday)
– Section 3.7.4 Procedure
– 2nd Midterm Exam on Friday Nov. 15
3
Chapter 3 Machine-Level Representations of
Programs
• 11/01/2013 (Friday)
– Loop slides 89-96
– Questions on Assignment 4
• 10/30/2013 (Wednesday)
– Practice Problems on Conditional Flags
– Reminder: Quiz on Friday Nov. 1st
• 10/28/2013 (Monday)
– Jump Instructions slides 76-87
– Assignment 4 is due Nov. 4.
4
Chapter 3 Machine-Level Representations of
Programs
• The week of 10/21-10/25
– Replacement Lectures by Prof. Turgay Korkmaz
and TA
– Slides 52-75
5
Chapter 3 Machine-Level Representations of
Programs
• 10/18/2013 (Friday)
– Shift Operations
– Examples 4-8
– Note: Conference Travel Oct. 21-25.
• Replacement Lectures by Prof. Turgay Korkmaz and TA
• 10/16/2013 (Wednesday)
–
–
–
–
Examples 2-3
Arithmetic and Logical Operations
Practice Problems 4 and 5
Slides 33-42
• 10/14/2013 (Monday)
– Movement Instructions
– Practice Problems 2 and 3, Example 1
– Slides 26-32
6
Chapter 3 Machine-Level Representations of
Programs
• 10/11/2013 (Friday)
– Operand forms
– Practice Problem 1
• 10/09/2013 (Wednesday)
– An introduction to Assembly Code
• Read Sections 3.1-3.3
• Slides 1-16
7
Example of Assembly Codes
• Example 1 – sum.c
int add(int x, int y) {
int z;
z=x+y;
return z;
}
•
What is its assembly code?
gcc –O1 –S sum.c
• Department Machines: elk01(~08).cs.utsa.edu
8
sum.s
.file "sum.c"
.text
.globl add
.type add, @function
add:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
addl 8(%ebp), %eax
popl %ebp
ret
.size add, .-add
.ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
.section
.note.GNU-stack,"",@progbits
• Ignore the lines that start
with .
• The pushl and popl save and
restore %ebp
• In movl, the first argument is
the source, and the second is
the destination
• addl adds the source and
destination and stores the
results in the destination
• %eax is used to hold the
return value.
• x and y are at 8(%ebp) and
12(%ebp)
• Stack set-up and
completion
9
Assembly Code
• Highly machine specific
• Why study it?
– Being able to read and understand it is an
important skill for serious programmers.
– Shifted over the years from one of being able to
write programs directly in assembly to one of
being able to read and understand the code
generated by compilers.
10
An Introduction to Assembly Language
• IA32 – Intel Architecture 32-bits
– The dominant machine language of most computers, and x8664, its extension to run on 64-bit machine.
– All the examples in this chapter are related mainly to 32-bits
IA32 – it is our focus.
– Computers execute machine code.
• Sequences of bytes encoding low-level operations.
– Assembly Code: a textual representation for the machine code
giving the individual instructions in the program.
– Complier, e.g., gcc C complier invokes an assembler and a linker
to generate the executable machine code from the assembly
code.
• We take a close look at machine code and its humanreadable representation as assembly code.
11
ATT versus Intel Assembly-code
formats
• In our representation, we use ATT-format
– The default format for GCC, OBJDUMP, and the
other tools
• Other programming tools, including those
from Microsoft as well as the documentations
from Intel, use Intel-format.
– gcc –O1 –S –masm=intel sum.c
12
In the simplest assembly language model,
a computer consists of • A main memory
– An array of bytes
– Consecutive numbered start at 0.
• These numbers are called memory addresses.
• A program counter or PC
–
–
Hold a memory address.
Called %eip in IA32.
• A register file containing a small number of named locations.
–
Each location (register) can hold a fixed amount of information corresponding to the word size of the
machines
•
•
Typical word size is 4 bytes (32-bits machine)
%eax, %edx, %ecx, %ebx, %esi, %edi, %esp, %ebp (8 registers)
• Conditional code registers
–
–
–
Contain information about the last arithmetic or logical operation.
For example, ZF (zero flag) is set if the last operation resulted in 0.
For example, SF (sign flg) is set if the last operation yielded a negative value.
• A set of floating-point registers for holding floating-point data
13
Section 3.1 History of Intel Processor Line
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
1972: 8008 (3.5K) - first Intel microprocessor with 8-bit words.
The instruction set was designed by Datapoint Corporation which was a leading maker of
programmable CRT terminals.
Datapoint was based in San Antonio, so you might say that the Intel architecture started just a few
miles from here.
1974: 8080 (4.5K) - first successful Intel microprocessor, had some 16-bit instructions.
1978: 8086 (29K) - One of the first 16-bit microprocessors.
20-bit addresses with segmented address space.
1979: 8088 (29K) - An 8086 with an 8-bit external bus - basis of the original IBM PC
1980: 8087 (45K) - A floating point coprocessor for the 8086 and 8088, formed the bases for IEEE floating
point standard.
1982: 80286 (134K) - basis of the IBM PC-AT and MS Windows
1985: 80386 (275K) (also called i386 – expanded the architecture to 32 bits) - added flat address space,
could run Linux.
1989: 80486 (1.2M) - integrated the floating point processor
1993: Pentium (3.1M) - improved performance
1995: PentiumPro (5.5M) - new processor design
1997: Pentium 2 (7M) - more of the same
1999: Pentium 3 (8.2M) - new floating point instructions
2000: Pentium 4 (42M) - double precision floating point and many new instructions.
2004: Pentium 4E (125M) - added hyperthreading
2006: Core2 Duo (291M) - multiple cores, not hyperthreading
2008: Core i7 Quad (781M) - multiple cores and hyperthreading
2010: Itanium Tukwila (2B) - instruction-level parallelism
14
2011: Xeon Westmere (2.6B) - 10 cores
Stack
• Stack
– Some region of memory
– A data structure where values can be added or
deleted, but only according to a “last-in, first-out”
discipline
– push: add data
– pop: remove data
15
Consider the following
int sum(int x, int y) {
return x + y;
}
•
•
•
•
•
•
•
•
•
Before the function is entered, a stack is set up with the stack pointer contained in
a designated register (%esp).
The stack grows toward low memory.
The stack pointer points to the last item pushed on the stack.
The values of x and y are pushed on the stack.
The return address is also pushed on the stack.
Assume %esp is the stack pointer and all items are 4 bytes.
The return address is at 0(%esp) and the return value stored in %eax.
x is at 4(%esp).
y is at 8(%esp).
16
Machine code
• cc –c sum.s
• objdump –d sum.o which produces
--------------------------------------------------------sum.o: file format elf32-i386
Disassembly of section .text:
00000000 <add>:
0: 55
push %ebp
1: 89 e5
mov %esp,%ebp
3: 8b 45 0c
mov 0xc(%ebp),%eax
6: 03 45 08
add 0x8(%ebp),%eax
9: 5d
pop %ebp
a: c3
ret
• To inspect the contents of
machine-code files, a class
of programs known as
disasemblers can be
invaluable.
• objdump (for “object
dump”) generates a format
similar to assembly code
from the machine code.
• Each instruction takes up 1
to 15 bytes
• Common instructions such
as push, pop, or ret, are
short
17
Machine Code
• To use this program, we need a main to call it:
• e1.c
-------------------------------------------------------------------int add(int x, int y);
int main() {
int x = 12;
int y = 31;
int z;
z=add(x, y);
printf("x is %d, y is %d, and z is %d\n", x, y, z);
return 0;
}
• We do: cc –O1 –S e1.c to create: e1.s which is…
18
e1.s – Machine code
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $20, %esp
movl $31, 4(%esp)
movl $12, (%esp)
call add
movl %eax, 12(%esp)
movl $31, 8(%esp)
movl $12, 4(%esp)
movl $.LC0, (%esp)
call printf
movl $0, %eax
addl $20, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
19
Section 3.2 Program Encoding
• gcc –O1 –o sum sum.c
–
–
–
–
The -O1 is a compiler directive telling it to limit the optimizations used.
The compiler generates assembly code: sum.s
The assembler converts the assembly code into object code: sum.o
The linker combines the object code with the libraries to produce an
executable: sum
– The sum.s file is not saved by default.
• You can look at the assembly code generated using:
gcc -O1 -S sum.c
– This produces a file sum.s in ATT format.
• gcc –O1 –o sum sum.c
– This produces a file sum.s in Intel format.
• gcc –O1 –S –masm=intel sum.c
21
IA32 32-bit registers
• Eight 32-bits registers
%eax: accumulator
%ecx: counter
%edx: data
%ebx: base
%esi: source
%edi: destination
%esp: stack pointer
%ebp: frame pointer
22
Section 3.4 Access Information
• IA32 Registers
− 8 8-bit registers
− 8 16-bit registers
− 8 32-bit registers
• The first 6 32-bits
registers can be
considered general
purpose registers,
but historically they
had specific uses.
• You can modify the
8-bit registers
without modifying
the rest of the bits of
the corresponding
32-bit register.
23
Why these strange names?
• goes back to the 8080, an 8-bit machine with
registers: A, B, C, D, etc.
– The 8086 had 16-bit registers: ax, bx, cx, dx, where ax
was made up of 2 8-bit registers, al and ah.
– Similarly with bx, cx, and dx.
– The 32-bit version (80386) extended these to 32 bits,
making eax, ebx, etc.
– The low 16 bits of eax are just ax, and ax is made up
of ah and al.
• The 64-bit architecture has 128 64-bit registers
called r0 - r127.
24
Section 3.3 Data Formats for IA 32
• b Byte: 8 bits (of course)
– used for char
• w Word: 16 bits (for compatability with 16-bit architecture)
– used for short
• l Double Word: 32 bits
– used for int, long, and pointers
• s Single Precision: 32 bits
– used for float
• l Double Precision: 64 bits
– used for double
• t Extended Precision: 80 or 96 bits
– used for long double
• No direct support for long long (64-bit ints).
Operations must be done in pieces.
25
Section 3.4.1 Operand Specifiers
• There are 11 basic forms for operands.
– 1 for immediate (constant) values
– 1 for registers
– The rest are for memory.
• Three operand types:
– Immediate, is for constant values
• Written with a $ followed by an integer, e.g., $-577 or $0x17
– Register, denote the contents of one of the registers
• Its value R[Ea]
– Memory,
• Mb[Addr] to denote the b-byte value stored in memory starting at address
Addr
26
Operand Forms
• Operands can denote immediate (constant) values, register values, or values from memory.
• The scaling factor s must be either 1, 2, 4, or 8
• The general form is shown at the bottom of the table.
27
Practice Problem 1
• Assume the following values are stored at the indicated
memory addresses and registers
Address Values
Register Values
--------------------------------------------------0x100
0xFF
%eax
0x100
0x104
0xAB
%ecx
0x1
0x108
0x13
%edx
0x3
0x10C
0x11
Fill the following table:
Operand
Value
---------------------------------------------------%eax
________
0x104
________
$0x108
_________
(%eax)
_________
4(%eax)
_________
Operand
Value
--------------------------------------------------9(%eax, %edx)
__________
260(%ecx, %edx)
__________
0xFC(, %ecx, 4)
__________
(%eax, %edx, 4)
__________
28
Practice Problem 1 - Solution
• Assume the following values are stored at the indicated
memory addresses and registers
Address Values
Register Values
--------------------------------------------------0x100
0xFF
%eax
0x100
0x104
0xAB
%ecx
0x1
0x108
0x13
%edx
0x3
0x10C
0x11
Fill the following table:
Operand
Value
---------------------------------------------------%eax
_0x100___
0x104
_0xAB____
$0x108
_0x108____
(%eax)
__0xFF____
4(%eax)
__0xAB___
Operand
Value
--------------------------------------------------9(%eax, %edx)
_0x11_____
260(%ecx, %edx)
_0x13_____
0xFC(, %ecx, 4)
_0xFF_____
(%eax, %edx, 4)
_0x11_____
29
Data Movement Instructions
• MOV classes
– movb, movw, movl
– Operate on the data size of 1, 2, and 4 bytes,
respectively
– movs, movz classes
• movsbw, movsbl, movswl
– Sign-extended
• movzbw, movzbl, movzwl
– Zero-extended
30
Data Movement Instructions
Instruction
Effect
Description
MOV
movb
movw
movl
S, D
D S
Move bytes
Move words
Move double words
Move
MOVS
movsbw
movsbl
movswl
S, D
D  SignExtend(S)
Move sign-extended byte to word
Move sign-extended byte to double word
Move sign-extended word to double word
Move with sign extension
MOVZ
movzbw
movzbl
movzwl
S,D
D  ZeroExtend(S)
Move zero-extended byte to word
Move zero-extended byte to double word
Move zero-extended word to double word
Move with zero extension
pushl
S
Push double word
popl
D
R[%esp]  R[%esp]-4
M[R[%esp]]  S
D  M[R[%esp]];
R[%esp]  R[%esp]+4
Pop double word
Practice Problem 2
• Assume initially that %dh = 0xCD, %eax = 0x98765432
1.
2.
3.
movb %dh, %al
movsbl %dh, %eax
movzbl %dh, %eax
%eax =?
%eax = ?
%eax = ?
32
Practice Problem 2- Solution
• Assume initially that %dh = 0xCD, %eax = 0x98765432
1.
2.
3.
movb %dh, %al
movsbl %dh, %eax
movzbl %dh, %eax
%eax = 0x987654CD
%eax = 0xFFFFFFCD
%eax = 0x000000CD
33
Practice Problem 3
• What’s wrong with each line?
1.
2.
3.
4.
5.
6.
7.
movb $0xF, (%bl)
movl %ax, (%esp)
movw (%eax), 4(%esp)
movb %ah, %sh
movl %eax, $0x123
movl %eax, %dx
movb %si, 8(%ebp)
34
Practice Problem 3 - Solution
• What’s wrong with each line?
1. movb $0xF, (%bl)
Ans: cannot use %bl as address register
2.
movl %ax, (%esp)
Ans: mismatch between suffix with register ID
3. movw (%eax), 4(%esp)
Ans: cannot have both source and destination be memory address
4. movb %ah, %sh
Ans: no register named %sh
5. movl %eax, $0x123
Ans: Cannot have immediate as destination
6. movl %eax, %dx
Ans: Destination operand incorrect size
7.
movb %si, 8(%ebp)
Ans: Mismatch between instruction suffix with register ID.
35
Example 1
Example 1:
int simple(int x) {
return x+17;
}
Complies to:
simple:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
addl $17, %eax
popl %ebp
ret
// x into %eax
// x+17 into %eax
36
Example 2
Example 2:
int array(int* s, int i) {
return s[i];
}
Complies to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl 8(%ebp), %edx
movl (%edx,%eax,4), %eax
popl %ebp
ret
Question: if we changed this to an array of short, could we just change the 4 to 2?
37
Example 2
Example 2:
int array(int* s, int i) {
return s[i];
}
Complies to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl 8(%ebp), %edx
movl (%edx,%eax,4), %eax
popl %ebp
ret
// i into %eax
// s into %edx
// M[S+4*i] -> %eax
Question: if we changed this to an array of short, could we just change the 4 to 2?
38
Example 3
Example 3
short array(short* s, int i) {
return s[i];
}
Complies to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl 8(%ebp), %edx
movzwl (%edx,%eax,2), %eax
popl %ebp
ret
Questions:
1) what does the movzwl do?
2) What value would be returned in %eax if the array contained -1?
39
Example 3
Example 3
short array(short* s, int i) {
return s[i];
}
Complies to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl 8(%ebp), %edx
movzwl (%edx,%eax,2), %eax
popl %ebp
ret
// i into %eax
// s into %edx
// M[s+i*2] -> %eax
Questions:
1) what does the movzwl do?
2) What value would be returned in %eax if the array contained -1?
40
3.5 Arithmetic and Logical Operations
Instruction
Effect
Description
leal
S, D
D  &S
Load effective address
INC
DEC
NEG
NOT
D
D
D
D
D  D+1
D  D-1
D  -D
D  ~D
Increment
Decrement
Negate
Complement
ADD
SUB
IMUL
XOR
OR
AND
S, D
S, D
S, D
S, D
S, D
S, D
D  S+D
D  D-S
D  D*S
D  D^S
DD|S
DD&S
Add
Subtract
Multiply
Exclusive-or
Or
And
SAL
SHL
SAR
SHR
k, D
k, D
k, D
k, D
D  D<< k
D  D<< k
D  D>>A k
D  D>>L k
Left Shift
Left Shift (same as SAL)
Arithmetic right shift
Logical right shift
Figure 3.7 Integer arithmetic operations.
The load effective address (leal) instruction is commonly used to perform simple arithmetic.
The remaining ones are more standard unary or binary operations. We use the notation >>A
and >>L to denote arithmetic and logical right shift, respectively.
41
Section 3.5.2 Unary and Binary Operations
• Unary operations: inc, dec, neg, not
• Binary operations:
– Operate on source and destination, storing results
in destination
– add, sub, imul
– xor, or, and
• Bitwise operations
42
Practice Problem 4
Suppose register %eax holds value x and %ecx holds value y. Fill in the table below
with formulas indicating the value that will be stored in register %edx for each of the
given assembly code instructions:
Instruction
Result
______________________________________________________________________
leal
leal
leal
leal
leal
6(%eax), %edx
(%eax, %ecx), %edx
7(%eax, %eax, 8), %edx
0xA(, %ecx, 4), %edx
9(%eax, %ecx, 2), %edx
_____________
_____________
_____________
_____________
_____________
43
Practice Problem 4 - Solution
Suppose register %eax holds value x and %ecx holds value y. Fill in the table below
with formulas indicating the value that will be stored in register %edx for each of the
given assembly code instructions:
Instruction
Result
______________________________________________________________________
leal
leal
leal
leal
leal
6(%eax), %edx
(%eax, %ecx), %edx
7(%eax, %eax, 8), %edx
0xA(, %ecx, 4), %edx
9(%eax, %ecx, 2), %edx
____6+x______
___x+y______
___7+x+8y___
___10+4y____
___9+x+2y___
44
Practice Problem 5
• Assume the following values are stored at the indicated memory
addresses and registers
Address Values
Register Values
--------------------------------------------------0x100
0xFF
%eax
0x100
0x104
0xAB
%ecx
0x1
0x108
0x13
%edx
0x3
0x10C
0x11
Fill the following table:
Instruction
Destination
Value
________________________________________________________
addl %ecx, (%eax)
________
________
subl %edx, 4(%eax)
________
________
imul $16, (%eax, %edx, 4)
________
________
incl 8(%eax)
________
________
decl %ecx
________
________
subl %edx, %eax
________
________
45
Practice Problem 5 Solution
• Assume the following values are stored at the indicated memory
addresses and registers
Address Values
Register Values
--------------------------------------------------0x100
0xFF
%eax
0x100
0x104
0xAB
%ecx
0x1
0x108
0x13
%edx
0x3
0x10C
0x11
Fill the following table:
Instruction
Destination
Value
________________________________________________________
addl %ecx, (%eax)
_0x100__
__0x100_
subl %edx, 4(%eax)
_0x104__
__0xA8__
imul $16, (%eax, %edx, 4)
_0x10C_
__0x110__
incl 8(%eax)
_0x108__
__0x14_
decl %ecx
_%ecx__
__0x0___
subl %edx, %eax
_%eax___
_0xFD__
46
Section 3.5.3: Shift Operations
• D=[xn-1,xn-2, …, x0]
• Left Shift
– SAL, SHL are same
– D<<k = [xn-k-1,xn-k-2, …, x0, 0,0,…0]
• Dropping off the k most significant bits
• Right Shift
– SAR: arithmetic right shift
• D>>Ak = [xn-1, xn-1, …, xn-1,xn-2, …, xk]
– SHR: logical right shift
• D>>Lk = [0, 0, …, 0,xn-1,xn-2, …, xk]
• Shift Amounts
– k is encoded as a single byte, since only shift amounts between 0 and 31 are
possible (only the low-order 5 bits of the shift amounts are considered)
– Shift amount is given either as an immediate or in the single byte register
element %cl
47
Practice Problem 6
Suppose we want to generate assembly code for the following C function:
int shift_left2_rightn(int x, int n) {
x << = 2;
x >> = n;
}
The code that follows is a portion of the assembly code that performs the actual
shifts and leaves the final value in register %eax. Two key instructions have been
omitted. Parameters x and n are stored at memory locations with offsets 8 and 12,
respectively to the address in register %ebp.
1.
2.
3.
4.
movl
8(%ebp), %eax
_____________________
movl
12(%ebp), %ecx
_____________________
// get x
// x << =2
// get n
// x >> = n
48
Practice Problem 6 - Solution
Suppose we want to generate assembly code for the following C function:
int shift_left2_rightn(int x, int n) {
x << = 2;
x >> = n;
}
The code that follows is a portion of the assembly code that performs the actual
shifts and leaves the final value in register %eax. Two key instructions have been
omitted. Parameters x and n are stored at memory locations with offsets 8 and 12,
respectively to the address in register %ebp.
1.
2.
3.
4.
movl
_sall
movl
__sarl
8(%ebp), %eax
$2, %eax_____
12(%ebp), %ecx
%cl, %eax____
// get x
// x << =2
// get n
// x >> = n
49
Example 4
Example 4
void array_set(int* s, int i, int value) {
s[i]= value;
}
Compiles to:
array_set:
pushl
movl
movl
movl
movl
movl
popl
ret
%ebp
%esp, %ebp
// add comments
16(%ebp), %ecx
//
12(%ebp), %edx
//
8(%ebp), %eax
//
%ecx, (%eax,%edx,4) //
%ebp
50
Example 4
Example 4
void array_set(int* s, int i, int value) {
s[i]= value;
}
Compiles to:
array_set:
pushl
movl
movl
movl
movl
movl
popl
ret
%ebp
%esp, %ebp
16(%ebp), %ecx
12(%ebp), %edx
8(%ebp), %eax
%ecx, (%eax,%edx,4)
%ebp
// value into %ecx
// i into %edx
// s into %eax
// value into memory at (s + 4*i)
51
Example 5
Example 5: Examples 4 using short
void array_set(short* s, short i, short value) {
s[i]= value;
}
Compiles to:
array_set:
pushl %ebp
movl %esp, %ebp
// add comments here
movl 16(%ebp), %ecx
//
movl 12(%ebp), %edx
//
movl 8(%ebp), %eax
//
movw %cx, (%eax,%edx,2) //
popl %ebp
ret
Note: the use of movw and cx instead of movl and ecx
Note: 4 bytes are used to store value on the stack, even though only 2 are needed.
52
Example 5
Example 5: Examples 4 using short
void array_set(short* s, short i, short value) {
s[i]= value;
}
Compiles to:
array_set:
pushl %ebp
movl %esp, %ebp
movl 16(%ebp), %ecx
// value into %ecx
movl 12(%ebp), %edx
// i into %edx
movl 8(%ebp), %eax
// s into %eax
movw %cx, (%eax,%edx,2)
// value into memory at (s+ 2*i)
popl %ebp
ret
Note: the use of movw and cx instead of movl and ecx
Note: 4 bytes are used to store value on the stack, even though only 2 are needed.
53
Example 6
Example 6: using long long
long long array(long long* s, int i) {
return s[i];
}
Compiles to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %edx
movl 8(%ebp), %eax
leal (%eax,%edx,8), %edx
movl (%edx), %eax
movl 4(%edx), %edx
popl %ebp
ret
// add comments here
//
//
//
//
//
54
Example 6
Example 6: using long long
long long array(long long* s, int i) {
return s[i];
}
Compiles to:
array:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %edx
movl 8(%ebp), %eax
leal (%eax,%edx,8), %edx
movl (%edx), %eax
movl 4(%edx), %edx
popl %ebp
ret
// mov i into %edx
// s into %eax
// address of s[i] into %edx
// low 32 bits of s[i] into %edx
// high 32 bits of s[i] into %edx
// 64-bit return value in %edx, %eax
55
Example 7: Using Pointer Parameter
void exchange(int *xp, int *yp) {
int temp;
temp = *xp;
*xp = *yp;
*yp = temp;
}
Compiles to
exchange:
pushl
movl
pushl
movl
movl
movl
movl
movl
movl
popl
popl
ret
Question:
%ebp
%esp, %ebp
%ebx
8(%ebp), %edx
12(%ebp), %ecx
(%edx), %ebx
(%ecx), %eax
%eax, (%edx)
%ebx, (%ecx)
%ebx
%ebp
// add comments
//
//
//
//
//
//
//
//
It takes 4 movl instructions to do the exchange. The C source code does this in 3 moves? Why?
56
Example 7: Using Pointer Parameter
void exchange(int *xp, int *yp) {
int temp;
temp = *xp;
*xp = *yp;
*yp = temp;
}
Compiles to
exchange:
pushl
movl
pushl
movl
movl
movl
movl
movl
movl
popl
popl
ret
Question:
%ebp
%esp, %ebp
%ebx
8(%ebp), %edx
12(%ebp), %ecx
(%edx), %ebx
(%ecx), %eax
%eax, (%edx)
%ebx, (%ecx)
%ebx
%ebp
// %edx = xp
// %ecx = yp
// %ebx = *xp
// %eax = *yp
// *xp = *yp
// *yp = *xp
It takes 4 movl instructions to do the exchange. The C source code does this in 3 moves? Why?
57
Example 8: Arithmetic and Logical Operations
int arith(int x, int y, int z) {
int t1 = x + y;
int t2 = z + t1;
int t3 = x + 4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
Compiles to
arith:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %ecx
movl 12(%ebp), %edx
leal (%edx,%edx,2), %eax
sall
$4, %eax
leal 4(%ecx,%eax), %eax
addl %ecx, %edx
addl 16(%ebp), %edx
imull %edx, %eax
popl %ebp
ret
// add comments here
//
//
//
//
//
//
//
//
58
Example 8: Arithmetic and Logical Operations
int arith(int x, int y, int z) {
int t1 = x + y;
int t2 = z + t1;
int t3 = x + 4;
int t4 = y * 48;
int t5 = t3 + t4;
int rval = t2 * t5;
return rval;
}
Compiles to
arith:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %ecx
movl 12(%ebp), %edx
leal (%edx,%edx,2), %eax
sall
$4, %eax
leal 4(%ecx,%eax), %eax
addl %ecx, %edx
addl 16(%ebp), %edx
imull %edx, %eax
popl %ebp
ret
// %ecx = x
// %edx = y
// %eax = y+2*y = 3y
// %eax = 16*3y=48y =t4
// %eax = 4+x+48y =t3+t4=t5
// %edx = x + y=t1
// %edx = z+t1 = t2
// %eax = t2*t5
59
3.5.4 Discussions
• What does the following instruction do?
xorl %eax, %eax
60
3.5.4 Discussions
• What does the following instruction do?
xorl %eax, %eax
Ans: Set %eax to zero
61
Section 3.5. Special Arithmetic Operations
• 5 special operations
– imull, mull, cltd, idivl, divl
Instruction
Effect
Description
imull
S
R[%edx]: R[%eax] S × R[%eax]
Signed full multiply
mull
S
R[%edx]: R[%eax] S × R[%eax]
Unsigned full multiply
R[%edx]: R[%eax] SignExtend(R[%eax])
Convert to quad word
R[%edx] R[%edx] : R[%eax] mod S;
Signed divide (remainder)
R[%eax] R[%edx] : R[%eax]  S;
Quotient
R[%edx] R[%edx] : R[%eax] mod S;
Unsigned divide (remainder)
R[%eax] R[%edx] : R[%eax]  S;
Quotient
cltd
idivl
divl
S
S
62
imull and mull
• Take one operand: imull S
• Multiply the operand by %eax
• The resulting 64 bits are put in %edx (high
bits) and %eax (low bits)
– Compared to imull S, D
• imull throws away the high order bits that do not fit in
the destination
• imull is for signed and mull is for unsigned
63
Example
Suppose we have signed numbers x and y stored at positions 8
and 12 relative to %ebp, and we want to store their full 64-bit
product as 8 bytes on top of the stack.
1.
2.
3.
4.
movl
imul
movl
movl
12(%ebp), %eax
8(%ebp)
%eax, (%esp)
%edx, 4(%esp)
// y into %eax
// x * y in R[%edx]:R[%eax]
// store low 32 bits
// store high 32 bits
Note: Assume little endian machine
64
idivl and divl
•
•
•
•
Take one operand: idvil S
Divide the 64-bit R[%edx]:R[%eax] by the operand
The low 32 bits of the quotient are put in %eax
The remainder is put into %edx
65
Example 9
• Suppose we have signed numbers x and y
stored at positions 8 and 12 relative to %ebp
1.
2.
3.
4.
5.
movl 8(%ebp), %eax
cltd
idivl 12(%ebp)
movl %eax, 4(%esp)
movl %edx, (%esp)
66
Example 9
• Suppose we have signed numbers x and y
stored at positions 8 and 12 relative to %ebp
1.
2.
3.
4.
5.
movl 8(%ebp), %eax // x into %eax
cltd
// sign extended into %edx
idivl 12(%ebp)
// divide x by y
movl %eax, 4(%esp) // x/y
movl %edx, (%esp)
// x % y = x mod y
67
Section 3.6 Control
• Section 3.6.1 Conditional Codes
IA32 uses four single-bit flags called conditional codes which are set by
certain instructions based on the result of the instruction.
Flag
CF
ZF
SF
OF
Name
Carry flag
Use
Carry out of the most significant bit. Used to detect overflow
for unsigned operations.
Zero flag
The most recent operation yielded zero.
Sign flag
The most recent operation yielded a negative value.
Overflow flag The most recent operation caused a two's-complement
overflow - either positive or negative. (for signed overflow)
68
Conditional Codes
• OF flag: result of add or sub has wrong sign
– addl sets the OF if both operands have the same
sign, but the result has a different sign.
– Subl A, B calculates B-A and sets OF if
B>0, A<0, and B-A <0 or
B<0, A>0, and B-A > 0
69
Conditional Codes
• CF flag: carry out of high bit
– addl sets CF if unsigned result does not fit.
e.g., 8-bit unsigned operation: 127+ 131 = 2
– For shift operations
• CF is set to be the last bit shifted out.
• sal and shl set the carry bit to the former MSB (most
significant bit)
• sar and shr set the carry bit to the former LSB (least
significant bit)
70
Conditional Codes
• The following instructions set the conditional
codes appropriately:
inc, dec, neg, not, add, sub, mul, imul, div, idiv,
xor, or, and, sal, shl, sar, shr
• The following instructions do not modify the
condition codes:
mov, leal, push, pop, call, ret, cltd
71
Comparison and Test Instructions
Instruction
CMP
S2,
cmpb
cmpw
cmpl
TEST
testb
testw
testl
S 2,
S1
S1
Based on
S1 - S2
Compare byte
Compare word
Compare double word
Description
Compare
S 1 & S2
Test byte
Test word
Test double word
Test
These do not store the resulting computation in the destination,
only the conditional codes are set.
72
Example
• Suppose we used one of the ADD instructions
to perform the equivalent of the C assignment
t=a+b, where variables a, b, and t are integers.
•
•
•
•
CF:
ZF:
SF:
OF:
(unsigned) t < (unsigned) a
Unsigned Overflow
(t == 0)
Zero
(t < 0)
Negative
(a < 0 == b < 0) && (t < 0 != a < 0) signed overflow
73
3.6.2 Accessing the Condition Codes
• You can set a byte to 0 or 1 on the condition
flags with the set instructions.
• These take a single byte operand as the
destination: either an 8-bit register or a single
byte of memory.
74
The SET instructions
signed
unsigned
Instruction
Synonym
Effect
Description
sete D
setz
D = ZF
Equal or zero
Setne D
setnz
D=~ZF
Not equal or not zero
sets D
D = SF
negative
setns D
D = ~SF
nonnegative
setg D
setnle
D  ~(SF^OF) & ~ZF
Greater (signed >)
setge D
setnl
D  ~(SF^OF)
Greater or equal (signed >=)
setl
setnge
D  SF^OF
Less (signed <)
setle D
Setng
D  (SF^OF) | ZF
Less or equal (signed <=)
seta D
setnbe
D  ~CF & ~ZF
Above (unsigned >)
setae D
setnb
D ~CF
Above or equal (unsigned >=)
setb D
setnae
D  CF
Below (unsigned <)
Setbe D
setna
D  CF|ZF
Below or equal (unsigned <=)
D
• The important part of this table is the effect field which shows how
the 4 condition codes are related to various tests.
• The description field is based on a previous instruction of the form
cmp S2, S1
negative refers to the value of S1-S2
greater, less, above, or below refer to comparing S1 to S2
75
Unsigned Comparison
• In interpreting the effect and description, consider the
instruction:
cmpl S2, S1 which calculates S1-S2
If S1 and S2 are unsigned, S1 is above S2 if the result of S1-S2
is not zero and does not set the carry flag.
D  ~CF & ~ZF
• The other three comparisons can be understood from this
one using de Morgan’s laws.
76
Signed Comparison
• The signed comparison are a bit more complicated
• Consider greater than or equal test condition.
o
o
o
o
o
o
o
o
o
Under what conditions S1 >= S2
Answer: S1 – S2 >=0
This would indicate that we just want SF =0
But recall, that sometimes the SF is incorrect.
This is indicated by the OF flag.
So if SF is correct (OF=0), we just test SF=0, or ~SF.
If SF is incorrect (OF=1), we want SF=1.
This is ~(SF^OF)
^ is exclusive or
• Other signed comparison can be gotten from >= using De
Morgan’s laws.
77
Example
• Consider the following code segment:
cmpl $10, $20
jle .L1
Does this jump?
78
Example
• Consider the following code segment:
cmpl $10, $20
jle .L1
Does this jump?
Ans: No
79
Section 3.6.3 Jump Instructions and
Their Encoding
• Jump instruction change the flow of control so that the next
instruction executed is not the next instruction.
• Traditional instruction cycle, also called fetch-and-execute cycle or
fetch-decode-execute cycle.
• The program counter (PC) register contains the
address of the next instruction to execute.
– Fetch: read the instruction whose address is in the PC
– Increment PC: increment PC so that it points to the next
instruction.
– Decode: determine what instruction this is.
– Execute: do what the instruction indicates.
– Store: store the result.
80
Section 3.6.3 Jump Instructions and
Their Encoding
Instruction
unconditional
Synonym
Jump condition
Description
jmp Label
1
Direct jump
Jmp *Operand
1
Indirect jump
je Label
jz
ZF
Equal /zero
jne Label
jnz
~ZF
Not equal / not zero
SF
Negative
~SF
Nonnegative
js
Label
jns Label
conditional
jg Label
jnle
~(SF^OF) & ~ZF
Greater (signed >)
jge Label
jnl
~(SF^OF)
Greater or equal (signed >=)
jl Label
jnge
SF^OF
Less (signed <)
jle Label
jng
(SF^OF) | ZF
Less or equal (signed <=)
ja Label
jnbe
~CF & ~ZF
Above (unsigned >)
jae Label
jnb
~CF
Above or equal (unsigned >=)
jb Label
jnae
CF
Below (unsigned <)
jbe Label
jna
CF |ZF
Below or equal (unsigned <=)
Figure. 3.12. The jump instructions. These instructions jump to a labeled destination when the
jump condition holds. Some instructions have “synonyms”, alternate names for the same machine
instructions.
81
Unconditional jump Instruction
1.
2.
3.
4.
5.
mov1 $0, %eax
jmp .L1
movl (%eax), %edx
.L1:
popl %edx
•
IA 32 unconditional jump instructions:
Two types: direct and indirect
jmp Label
jmp *Operand
jmp *%eax
jmp *(%eax)
•
// set %eax to 0
// goto .L1
// will be skipped
// use the value in register %eax as the jump target
// use the value in register %eax as the read address
Unconditional jumps are rarely used, except with conditional jumps.
82
Conditional jump Example
An example: jump.c
int simple_jump(int x, int y, int z) {
if (x == 0)
return y-z;
return z-y;
}
After cc –O1 –S jump.c, jump.s contains
simple_jump:
pushl %ebp
movl %esp, %ebp
cmpl $0, 8(%ebp)
jne .L2
movl 12(%ebp), %eax
subl 16(%ebp), %eax
jmp .L3
.L2:
movl 16(%ebp), %eax
subl 12(%ebp), %eax
.L3:
popl %ebp
ret
83
Conditional jump Example
An example: jump.c
int simple_jump(int x, int y, int z) {
if (x == 0)
return y-z;
return z-y;
}
After cc –O1 –S jump.c, jump.s contains
simple_jump:
pushl %ebp
movl %esp, %ebp
cmpl $0, 8(%ebp)
jne .L2
movl 12(%ebp), %eax
subl 16(%ebp), %eax
jmp .L3
.L2:
movl 16(%ebp), %eax
subl 12(%ebp), %eax
.L3:
popl %ebp
ret
// compare x to 0
// jmp if x ! = 0
// y into %eax
// y-z into %eax
// done
// this is the case x != 0
// get z into %eax
// z-y into %eax
// common return
84
Jump instruction encoding
•
•
There are several ways that jump instructions are encoded, the simplest of which is with PCrelative destination.
After cc –c –O1 jump.c and objdump –d jump.o, we get
00000000 <simple_jump>:
0: 55
push
1: 89 e5
mov
3: 83 7d 08 00
cmpl
7: 75 08
jne
9: 8b 45 0c
mov
c: 2b 45 10
sub
f: eb 06
jmp
11: 8b 45 10
mov
14: 2b 45 0c
sub
17: 5d
pop
18: c3
ret
%ebp
%esp,%ebp
$0x0,0x8(%ebp)
11
0xc(%ebp),%eax
0x10(%ebp),%eax
17
0x10(%ebp),%eax
0xc(%ebp),%eax
%ebp
• Labels have been replaced by the address
relative to the start of the program.
• During the executable phase of the jne
instruction at 7, the PC has the value 9
(point to the next instruction).
• The encoding of jne shows a jump offset
of 8, 9+8 = 17=0x11
• During the execution of jmp instruction
at f, the PC has value 11.
• The jump offset is 6, giving 0x11 +0x6 =
0x17
85
Practice Problem 7
In the following excerpts from a disassembled binary, some of the information has
been replaced by Xs. Answer the following questions about these instructions
A. What is the target of the je instructions below? (You don’t need to know anything about the
call instruction here.)
804828f: 74 05
je
XXXXXXX
8048291: e8 1e 00 00 00
call
80482b4
B. What is the target of the jb instruction below?
8048357: 72 e7
jb
XXXXXXX
8048359: c6 05 10 a0 04 08 01 movb $0x1, 0x804a010
C.
What is the address of the mov instruction?
XXXXXXX: 74 12
je
XXXXXXX: b8 00 00 00 00
mov
8048391
$0x0, %eax
86
Practice Problem 7 Solution
In the following excerpts from a disassembled binary, some of the information has been replaced by Xs. Answer
the following questions about these instructions
A.
What is the target of the je instructions below? (You don’t need to know anything about the call instruction here.)
804828f: 74 05
je
XXXXXXX
8048291: e8 1e 00 00 00
call
80482b4
Ans: PC = 8048291; Offset = 05; Dest = PC + Offset = 8048291 + 0x05 = 8048296
B. What is the target of the jb instruction below?
8048357: 72 e7
jb
XXXXXXX
8048359: c6 05 10 a0 04 08 01 movb $0x1, 0x804a010
Ans: PC = 8048359 Offset = e7= 1110,0111=N*=-N=-[0001,1001]=-25=-0x19
Dest = PC + Offset = 8048359 -0x19 = 8048340
C.
What is the address of the mov instruction?
XXXXXXX: 74 12
je
XXXXXXX: b8 00 00 00 00
mov
8048391
$0x0, %eax
Ans: Dest = PC + Offset => PC = Dest – Offset = 8048391 – 0x12 = 804837F
Address of Jump Instruction = 804837F – 0x2= 804837D
87
Practice Problem 8
D. In the code that follows, the jump target is encoded in PC-relative form as a 4-byte,
two’s-complement number. The bytes are listed from least significant to most,
reflecting the little-endian byte ordering of IA32. What is the address of the jump
target?
80482bf:
80482c4:
e9 e0 ff ff ff
90
jmp
nop
XXXXXXX
E. Explain the relation between the annotation on the right and the byte coding on
the left.
80482aa: ff 25 fc 9f 04 08
jmp
*0x8049ffc
88
Practice Problem 8 Solution
D. In the code that follows, the jump target is encoded in PC-relative form as a 4-byte, two’scomplement number. The bytes are listed from least significant to most, reflecting the little-endian
byte ordering of IA32. What is the address of the jump target?
80482bf:
80482c4:
e9 e0 ff ff ff
90
jmp
nop
XXXXXXX
Ans: Offset = ffff,ffe0 = -32 = -0x20; PC = 80482c4
Dest = PC + Offset = 80482c4 – 0x20 = 80482A4
E. Explain the relation between the annotation on the right and the byte coding on the left.
80482aa: ff 25 fc 9f 04 08
jmp
*0x8049ffc
Ans: An indirect jump is denoted by instruction code ff 25. The address from which the jump target
is to read is encoded explicitly by the following 4 bytes. Since the machine is little endian, these are
given in reverse order as fc 9f 04 08.
89
Section 3.6.5 Loops
• C provides several looping constructs – namely, do-while, while, and
for.
• No corresponding instructions exist in machine codes. Instead,
combinations of conditional test and jumps are used to implement the
effect of loops.
90
Section 3.6.5 Loops
Do-while loops
While loops
do
while (test-expr)
body-statement
for (init-expr; test-expr; update-expr)
body-statement
It differs from do-while in
that test-expr is evaluated
and the loops is potentially
terminating before the first
execution of body-statement.
Identical to the following code:
body-statement
while(test-expr);
•
•
The effect of the loop is to
repeatedly execute bodystatement, evaluate testexpr, and continue the loop
if the evaluation result is
nonzero.
The body-statement is
executed at least once.
For loops
init-expr;
while (test-expr) {
body-statement
update-expr;
}
91
Example 1: A do-while loop
int fact_do(int n) {
int result = 1;
do {
result *= n;
n--;
} while (n > 1);
return result;
}
And the corresponding assembly code:
fact_do:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl $1, %eax
.L2:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jg .L2
popl %ebp
ret
92
Example 1: A do-while loop
int fact_do(int n) {
int result = 1;
do {
result *= n;
n--;
} while (n > 1);
return result;
}
And the corresponding assembly code:
fact_do:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl $1, %eax
.L2:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jg .L2
popl %ebp
ret
// n into %edx
// result is in %eax, initial value =1
// result = result * n
// n--;
// compare n to 1
// jump if n > 1
93
Example 2: A while loop
C code:
int fact_while(int n) {
int result = 1;
while (n > 1) {
result *= n;
n--;
}
return result;
}
The corresponding assembly code:
fact_while:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl $1, %eax
cmpl $1, %edx
jle .L3
.L6:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jg .L6
.L3:
popl %ebp
ret
94
Example 2: A while loop
C code:
int fact_while(int n) {
int result = 1;
while (n > 1) {
result *= n;
n--;
}
return result;
}
The corresponding assembly code:
fact_while:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl $1, %eax
cmpl $1, %edx
jle .L3
.L6:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jg .L6
.L3:
popl %ebp
ret
// get n into %edx
// result in %eax
// see if n > 1
// no, we are done
// yes, result = result * n
// n--;
// compare again
// keep going if n > 1
95
Example 3: A for loop
C code:
int fact_for(int n) {
int i;
int result = 1;
for (i=2; i <=n; i++)
result *= i;
return result;
}
The corresponding assembly code:
fact_for:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %ecx
movl $2, %edx
movl $1, %eax
cmpl $1, %ecx
jle .L3
(continue if n > =2)
.L6:
imull %edx, %eax
addl $1, %edx
cmpl %edx, %ecx
jge .L6)
.L3:
popl %ebp
ret
96
Example 3: A for loop
C code:
int fact_for(int n) {
int i;
int result = 1;
for (i=2; i <=n; i++)
result *= i;
return result;
}
The corresponding assembly code:
fact_for:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %ecx
movl $2, %edx
movl $1, %eax
cmpl $1, %ecx
jle .L3
.L6:
imull %edx, %eax
addl $1, %edx
cmpl %edx, %ecx
jge .L6
.L3:
popl %ebp
ret
// n into %ecx
// 2 into %edx (this is i)
// 1 into %eax (the result)
// compare n to 1
// done if n <=1 (continue if n > =2)
// result = result * i
// i++
// compare n to i
// continue if n >=i (done if n < i)
97
• We will skip the sections 3.6.6 and 3.6.7.
98
Section 3.7 Procedure
• A procedure involves:
– Passing data in the form of procedure parameters
and return values
– Passing control from one part of program to
another.
– Allocate space for the local variables of the
procedure on entry and deallocate them on exit.
99
Section 3.7.1 Stack Frame Structure
•
Procedure P calls procedure Q
– P: caller
– Q: callee
•
The stack is used for passing
parameters, for local variables,
and storing other values.
•
The stack is organized into
pieces called Stack Frames.
•
Stack frame has 2 pointers
– %ebp the frame pointer
• Most information is accessed relative to
the frame pointer
– %esp, the stack pointer
• Can move
100
Section 3.7.1 Stack Frame Structure
•
The %ebp frame pointer points to the saved
%ebp register on the stack
•
Usually %ebp does not change
•
The first parameter is at 8(%ebp) because of
the return address and saved %ebp.
•
%ebp is used to address data on the caller’s
stack (such as parameters)
•
In our examples, %esp usually did not
change during execution, but in general it
will when space on the stack is allocated for
–
–
–
Saved registers
Local variables
Parameters of procedures that will be called, e.g., in
Q, it calls R
101
Section 3.7.2 Transferring Control
Three instructions used for supporting procedures
Instruction
call
Label
call
*Operand
leave
ret
Description
Procedure call – direct
Procedure call - indirect
Prepare stack for return
Return from call
• call pushes the return address (current PC) on the stack and sets the
PC to the label
— Current PC holds the return address – the address of next instruction
— direct and indirect call
• leave is equivalent to:
mov %ebp, %esp
popl %ebp
The purpose of the first of these is to restore the stack pointer to the value it had after
the initial push of %ebp
We haven’t seen leave before because none of our procedures have needed to change
%esp, so the first of these was not necessary.
• ret pops the return address and jumps to this address.
102
Practice Problem 9
•
Example Section 3.2.2 sum and main - the following are excerpts of the disassembled code for the
two functions:
1
2
3
4
5
Beginning of function sum:
08048394 <sum>:
8048394: 55
…
Return from function sum
80483a4: c3
…
call to sum from main
80483dc: e8 b3 ff ff ff
80483e1: 83 c4 14
push %ebp
ret
call 8049394 <sum>
add $0x14, %esp
Trace the registers %eip (PC) and %esp:
1) Before executing call, PC(%eip) = ________; % esp = 0xff9b960
2) When executing call, PC value (return address) is pushed into stack; %esp = _______; %eip =
________
3) After return from call, %eip = _________; %esp = __________
103
Practice Problem 9 Solution
•
Example Section 3.2.2 sum and main - the following are excerpts of the disassembled code for the
two functions:
1
2
3
4
5
Beginning of function sum:
08048394 <sum>:
8048394: 55
…
Return from function sum
80483a4: c3
…
call to sum from main
80483dc: e8 b3 ff ff ff
80483e1: 83 c4 14
push %ebp
ret
call 8049394 <sum>
add $0x14, %esp
Trace the registers %eip (PC) and %esp:
1) Before executing call, PC(%eip) = 80483dc; % esp = 0xff9b960
2) When executing call, PC value (return address) is pushed into stack; %esp = 0xff9b95c; %eip =
0x8048394
3) After return from call, %eip = 0x80483e1; %esp = 0xff9b960
104
Section 3.7.3 Register Usage
Conventions
• The set of program registers acts as a single resource
shared by all of the procedures.
• Only one procedure can be active at a given time
• caller-save registers:
– %eax, %edx, %ecx
– When Q is called by P, it can overwrite these registers without destroying any
data required by P.
• callee-save registers:
– %ebx, %esi, %edi
– Q must save the values of any of these registers on the stack before
overwriting them, and store them before returning. P might need these
values for its further computation.
– %ebp, %esp must be maintained according to the conventions described here.
105
Example
Assembly Code Sequence:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
•
•
Subl
movl
movl
movl
movl
movl
movl
movl
movl
movl
$12,
%esp
%ebx,
(%esp)
%esi,
4(%esp)
%edi,
8(%esp)
8(%ebp), %ebx
12(%ebp), %edi
(%ebx),
%esi
(%edi),
%eax
16(%ebp), %edx
(%edx),
%ecx
Three registers (%ebx, %esi, %edi) are saved on the stack (Lines 2-4). The program modifies
these and three other registers (%eax, %ecx, %edx)
At the end of the procedure, the values of registers %edi, %esi, %ebx are restored (not
shown), while the other three are left in their modified states.
106
A procedure example of Section 3.7.4
C Code:
Here is the assembly code generated:
int swap_add(int *xp, int *yp) {
int x = *xp;
int y = *yp;
*xp = y;
*yp = x;
return x + y;
}
swap_add:
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %edx
movl 12(%ebp), %ecx
movl (%edx), %ebx
movl (%ecx), %eax
movl %eax, (%edx)
movl %ebx, (%ecx)
addl %ebx, %eax
popl %ebx
popl %ebp
ret
//
//
//
//
//
//
//
//
107
A procedure example of Section 3.7.4
C Code (swap_add.c)
Here is the assembly code generated:
int swap_add(int *xp, int *yp) {
int x = *xp;
int y = *yp;
*xp = y;
*yp = x;
return x + y;
}
swap_add:
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %edx
movl 12(%ebp), %ecx
movl (%edx), %ebx
movl (%ecx), %eax
movl %eax, (%edx)
movl %ebx, (%ecx)
addl %ebx, %eax
popl %ebx
popl %ebp
ret
// xp into %edx
// yp into %ecx
// x, i.e. %ebx = *xp
// y, i.e. %eax = *yp
// *xp = *yp
// *yp = *xp
// %eax = x +y for return
108
A procedure example of Section 3.7.4
C Code (caller.c)
Here is the assembly code generated:
int caller() {
caller:
int arg1 = 534;
pushl %ebp
int arg2 = 1057;
movl %esp, %ebp
int sum = swap_add(&arg1, &arg2);
subl $24, %esp
//
int diff = arg1 - arg2;
movl $534, -4(%ebp) //
movl $1057, -8(%ebp) //
return sum*diff;
leal -8(%ebp), %eax //
}
movl %eax, 4(%esp)
leal -4(%ebp), %eax
movl %eax, (%esp)
call swap_add
.R1 movl -4(%ebp), %edx
subl -8(%ebp), %edx
imull %edx, %eax
leave
ret
//
//
//
//
//
//
//
109
A procedure example of Section 3.7.4
C Code (caller.c)
Here is the assembly code generated:
int caller() {
caller:
int arg1 = 534;
pushl %ebp
int arg2 = 1057;
movl %esp, %ebp
int sum = swap_add(&arg1, &arg2);
subl $24, %esp
//allocate 6 double words on the stack
int diff = arg1 - arg2;
movl $534, -4(%ebp) // 534 on stack
movl $1057, -8(%ebp) // 1057 on the stack
return sum*diff;
leal -8(%ebp), %eax // &1057 into %eax
}
movl %eax, 4(%esp)
leal -4(%ebp), %eax
movl %eax, (%esp)
call swap_add
.R1 movl -4(%ebp), %edx
subl -8(%ebp), %edx
imull %edx, %eax
leave
ret
// &1057 on stack
// &534 on into %eax
// &534 on stack
// arg1 into %edx
// arg1 – arg2 into %edx
// diff * return value in %eax
// restore the stack pointer
110
A procedure example of Section 3.7.4
Practice Problem 1:
Keep track of register values: %eax, %ebx, %ecx, %edx, %ebp,
%esp
111
A procedure example of Section 3.7.4
Why did the compiler reserve 6 words = 24 bytes on the stack when it only
needed 4?
Answer:
• Convention: the total number of stack bytes used by a function
should be a multiple of 16, e.g., 16, 32, 48.
• This counts the 4 bytes for the return address and the 4 bytes
for the saved %ebp
• If only 4 words were reserved, this would be 16+8=24 bytes
• To get this up to 32, we need to add 8 more bytes, or 2 more
words.
• This does not reduce the speed of execution.
• It does use a small amount of extra memory.
112
A recursive procedure example of
Section 3.7.5
C code:
Here is the assembly code generated:
rfact:
int rfact(int n) {
int result;
if (n <1 )
result =1;
else
result = n * rfact(n-1);
return result;
}
.R1
.L3:
pushl %ebp
movl %esp, %ebp
pushl %ebx
subl $4, %esp
movl 8(%ebp), %ebx
movl $1, %eax
testl %ebx, %ebx
jle .L3
leal -1(%ebx), %eax
movl %eax, (%esp)
call rfact
imull %ebx, %eax
addl $4, %esp
popl %ebx
popl %ebp
ret
113
A procedure example of Section 3.7.5
Practice Problem 2:
Keep track of register values: %eax, %ebx, %ebp, %esp
115
Download