EC310-Lesson 04-Memory Mechanics inclass completed

Lesson 04: Main Memory Mechanics
Objectives:
(a) Describe the organization and contents of main memory when a program is being executed.
(b) Demonstrate the ability to analyze a C program and identify the corresponding assembly language instructions generated.
(c) Explain and demonstrate how data is stored in memory for integers, floats and addresses (e.g. little endian).
1. Introduction
In the previous lessons, we introduced some of the basic concepts of a C program at the high level. In this lesson, we will look closely
at the low level mechanics of the main memory. Specifically, we will
• Introduce the memory partition known as the Text Segment, where instructions are stored.
• Introduce the memory partition known as the Stack, where variables are stored.
• Introduce the special processor registers EIP, EBP, and ESP, which are used to access the text segment and the stack.
2. Overview of Main Memory and CPU
Recall from a previous lesson that a high-level C program must be converted to binary machine code in order to execute on a CPU.
This machine code is stored in a portion of memory called the Text Segment. The variables in the program are stored on the
Stack.
The x86 processor/CPU (covered in this class) has a set of registers that it uses to support data movement to and from main memory.
The three main registers are:

eip: This is the most important register. It is known as the Instruction Pointer or the Program Counter. It holds the address of the next instruction the CPU intends to execute.

esp: The CPU reserves a section of memory, called the stack, to store values that the CPU might want to retrieve later. The esp register stores the address of the "top" of the stack. The name esp stands for extended stack pointer, but it is usually just called the stack pointer.

ebp: This register is called the base pointer. It is used to point to the "bottom" of the stack. (To be more precise, we will see later that ebp actually points to the very first address after the bottom of the stack.)

Fig 1. Overview of CPU and memory: the CPU (registers eip, esp, ebp; Fetch, Decode, Execute) alongside main memory (Text Segment and Stack).
Adapted from Patterson and Hennessy, Computer Organization and Design – the hardware/software interface, Elsevier, 5th ed, 2014.
The Central Processing Unit (CPU) of a computer uses a basic three-step cycle of:
• Fetch a program instruction from main memory,
• Decode the instruction to determine what actions to take,
• Execute the required actions for the instruction.
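The three-step cycle can be sketched as a loop over a toy machine. This is only an illustration: the opcodes OP_HALT, OP_LOAD, and OP_ADD and the function run_toy_cpu are hypothetical names invented here, not x86 encodings, and the variable eip is just an array index that plays the role of the instruction pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte opcodes for a toy machine; these are NOT x86 encodings. */
enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_ADD = 0x02 };

/* Runs a toy program stored in 'text' and returns the final accumulator value.
   LOAD and ADD are each followed by a one-byte operand. */
int run_toy_cpu(const uint8_t *text, size_t len)
{
    size_t eip = 0;   /* index of the next instruction to execute */
    int acc = 0;      /* a single working register */

    while (eip < len) {
        uint8_t opcode = text[eip++];        /* fetch */
        switch (opcode) {                    /* decode */
        case OP_LOAD:                        /* execute: acc = operand */
            if (eip >= len) return acc;      /* operand missing: stop */
            acc = text[eip++];
            break;
        case OP_ADD:                         /* execute: acc += operand */
            if (eip >= len) return acc;
            acc += text[eip++];
            break;
        case OP_HALT:
        default:                             /* halt, or unknown opcode */
            return acc;
        }
    }
    return acc;
}
```

Calling run_toy_cpu on the bytes {OP_LOAD, 7, OP_ADD, 5, OP_HALT} walks through three fetch-decode-execute rounds and returns 12. Note that eip always advances past the instruction just fetched, so it holds the location of the next instruction.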
3. x86 Assembly Language
Now that you have a general idea of the relationship between C language, assembly language, and machine language, let’s explore the
actual hardware and software that we will use. In this class, we mainly focus on hardware that runs the x86 instruction set, the so-called
x86 chip. This is by far the most common hardware implementation in PCs and servers. Here is a cheat sheet of common assembly
language instructions. You should refer back to it when you later encounter an assembly language instruction that is unfamiliar.
Instruction | Meaning | Example | Explanation of the example
mov | move | mov DWORD PTR [esp],0x804848a | Place the value 0x804848a in the location specified by the address in the esp register.
cmp | compare | cmp DWORD PTR [ebp],0x4 | Compare the value 4 to the value stored at the address contained in the ebp register.
jne | jump if not equal | jne 0x804839f | This instruction always follows a comparison (cmp). If the two items in the prior comparison were not equal, jump to the instruction stored at address 0x804839f.
jle | jump if less than or equal | jle 0x804839f | This instruction always follows a comparison (cmp). If the first item in the prior comparison is less than or equal to the second item, jump to the instruction stored at address 0x804839f. For example, if the prior comparison was cmp DWORD PTR [ebp],0x4, then if the value stored at the address pointed to by the ebp register is less than or equal to 4, we jump to the instruction stored at address 0x804839f.
jl | jump if less than | | Same as jle, but the jump is taken only if the first item is less than the second.
jge | jump if greater than or equal | | Same as jle, but the jump is taken only if the first item is greater than or equal to the second.
jg | jump if greater than | | Same as jle, but the jump is taken only if the first item is greater than the second.
jmp | jump | jmp 0x804839f | Jump to the instruction located at address 0x804839f.
inc | increment | inc DWORD PTR [eax] | Increment by one the value stored at the memory location contained in the eax register.
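To connect the cheat sheet to C, the sketch below marks where a compiler would typically emit these instructions. The function count_not_four is a made-up example, and the comments describe common output only; the exact instructions depend on the compiler and its optimization settings.

```c
/* Counts how many elements of 'values' are not equal to 4. */
int count_not_four(const int *values, int n)
{
    int count = 0;               /* typically a mov of 0 into a stack slot */
    int i = 0;

    while (i < n) {              /* loop test: typically cmp followed by jl or jge */
        if (values[i] != 4) {    /* typically cmp ...,0x4 followed by a conditional
                                    jump (which one depends on branch layout) */
            count++;             /* may compile to inc, or to an add of 1 */
        }
        i++;
    }                            /* end of loop body: typically jmp back to the test */
    return count;
}
```

For the array {4, 1, 4, 9} the function returns 2. Compiling a function like this with debugging output enabled is one way to see which of the instructions above actually appear.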
4. Main Memory
We will briefly discuss details of the two main portions in the main memory: the text segment and the stack.
The Text Segment: Let’s start the discussion with a simple C program below.
    #include <stdio.h>
    int main()
    {
        int x = 7;
        x = 2001;
    }
When this C program is compiled, machine code is generated and then loaded into memory, specifically into the text
segment. Machine language instructions can vary in length. For example, the instruction at address 0x08048345 (0x89 0xe5) is two
bytes long, and each memory location holds one byte, so this instruction occupies addresses 0x08048345 and
0x08048346. Similarly, the instruction at address 0x08048354 is 7 bytes long; therefore it occupies addresses 0x08048354 to
0x0804835a.
[Figure: contents of the text segment for this program, with the instructions generated for int x = 7; and x = 2001; marked. Note 0x7d1 = 2001.]
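The address arithmetic for variable-length instructions can be checked with a short sketch. The helper names last_byte_address and next_instruction_address are hypothetical, chosen here for illustration.

```c
#include <stdint.h>

/* Last address occupied by an instruction that starts at 'start'
   and is 'length' bytes long (one byte per memory location). */
uint32_t last_byte_address(uint32_t start, uint32_t length)
{
    return start + length - 1;
}

/* Address of the instruction that immediately follows in the text segment. */
uint32_t next_instruction_address(uint32_t start, uint32_t length)
{
    return start + length;
}
```

With the values from the text, last_byte_address(0x08048345, 2) gives 0x08048346 and last_byte_address(0x08048354, 7) gives 0x0804835a.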
The Stack: The program’s variables are stored on the stack. When an int or a float variable is declared in a C program, four bytes
are reserved on the stack between ebp and esp. Note that these variables are stored in little-endian order. The little-endian approach
stores the least significant byte at the lowest address, the second-least-significant byte at the next address, and so on.
For example, if we declare an integer variable var as
    int var = 0x12345678;
and assume that var is stored in memory starting at address 0xbffff818, its four bytes are laid out as follows.
Variable in memory
Address | Content
0xbffff818 | 0x78 (LSB, Least Significant Byte)
0xbffff819 | 0x56
0xbffff81a | 0x34
0xbffff81b | 0x12 (MSB, Most Significant Byte)
Now, consider the assembly language instruction below.
    mov DWORD PTR [ebp-4],0x00001234
This assembly language instruction means (in plain English): Move the value 0x1234 into the address pointed to by ebp-4
(the base pointer address, minus 4 bytes). The value will occupy 4 bytes (i.e. DWORD).
Location | Content
ebp-4 | 0x34
ebp-3 | 0x12
ebp-2 | 0x00
ebp-1 | 0x00
ebp | (first address after the value)
(In the original figure this region is drawn at addresses 0xBFFFF806 through 0xBFFFF80B.)
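The byte order can be reproduced in C. The sketch below uses two hypothetical helpers: to_little_endian computes the little-endian layout with shifts, so it gives the same answer on any machine, while raw_bytes copies the bytes exactly as the host stores them, which matches only on a little-endian host such as x86.

```c
#include <stdint.h>
#include <string.h>

/* Writes the four bytes of 'value' into out[0..3] in little-endian order:
   out[0] is the least significant byte, out[3] the most significant. */
void to_little_endian(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Copies the bytes of 'value' exactly as the host keeps them in memory.
   On x86 this produces the same result as to_little_endian. */
void raw_bytes(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}
```

For 0x12345678, to_little_endian produces 0x78, 0x56, 0x34, 0x12, and for 0x00001234 it produces 0x34, 0x12, 0x00, 0x00, matching the two layouts above.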
Important notes:
• In this course, storing values in memory in little-endian format ONLY applies to int and float values, and addresses. It
does NOT apply to strings, which are comprised of ASCII characters that only occupy one byte each.
• In addition, concerning arrays of int or float values, the individual int or float values are stored in little-endian
format, but the array elements are stored in order from index 0, 1, 2, etc.
• Memory can be shown as one byte per row as in the previous example or one word per row as shown below.
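Address | +0 | +1 | +2 | +3
0xbffff7e8 | 0xf0 | 0xf7 | 0xff | 0xbf
0xbffff7ec | 0x39 | 0x84 | 0x04 | 0x08
0xbffff7f0 | 0x4f | 0x63 | 0x74 | 0x6f
0xbffff7f4 | 0x62 | 0x65 | 0x72 | 0x00
0xbffff7f8 | 0x18 | 0xf8 | 0xff | 0xbf
The column headings +0 to +3 are address offsets from the row address.
What is the address of the byte in row 0xbffff7f8, column +2?
0xbffff7fa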
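In the word-per-row view, a byte's address is its row address plus its column offset. A minimal sketch, assuming each row starts at a multiple of 4 (as it does in this table); the function names are hypothetical.

```c
#include <stdint.h>

/* Address of the byte at column 'offset' (0 to 3) in the row starting at 'row_address'. */
uint32_t byte_address(uint32_t row_address, uint32_t offset)
{
    return row_address + offset;
}

/* Row address for a given byte address, assuming rows are 4-byte aligned. */
uint32_t row_of(uint32_t address)
{
    return address & ~(uint32_t)3;
}

/* Column offset (0 to 3) for a given byte address. */
uint32_t offset_of(uint32_t address)
{
    return address & 3u;
}
```

For example, byte_address(0xbffff7f8, 2) is 0xbffff7fa, and going the other way, row_of(0xbffff7fa) is 0xbffff7f8 with offset_of(0xbffff7fa) equal to 2.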
Example 1: Suppose that the following variables are declared in a C program:
    char initial = 'A';          // = 0x41
    int alpha = 291;             // = 0x123 = 0x00000123
    int grades[2] = {80, 96};    // = {0x50, 0x60}
    char school[5] = "Navy";     // = {0x4E, 0x61, 0x76, 0x79, 0x00}
1. How many total bytes are used to store all of these variables in memory?
(1 + 4 + 2*4 + 5*1) = 18 bytes
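The byte count can be confirmed with sizeof. This sketch assumes a platform where int is 4 bytes, as on the x86 systems used in this course; the function name example1_total_bytes is made up for illustration.

```c
#include <stddef.h>

/* Adds up the storage used by the four variables from Example 1.
   Assumes sizeof(int) == 4, which holds on x86 but is not guaranteed by C. */
size_t example1_total_bytes(void)
{
    char initial = 'A';
    int alpha = 291;
    int grades[2] = {80, 96};
    char school[5] = "Navy";   /* four letters plus the terminating 0x00 */

    return sizeof initial + sizeof alpha + sizeof grades + sizeof school;
}
```

On such a platform the function returns 18. The compiler may still place padding between variables on the stack, so the total footprint in memory can be larger than this sum.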
2. Once these variables are stored, the stack looks as follows. Complete the memory table below.
Memory Address | Data at that Memory Address (Hex) | Variable Name
0xbffff806 | ‘N’ = 0x4E | school
0xbffff807 | ‘a’ = 0x61 |
0xbffff808 | ‘v’ = 0x76 |
0xbffff809 | ‘y’ = 0x79 |
0xbffff80a | 0x00 = NULL |
0xbffff80b | gar (garbage) |
0xbffff80c | 0x50 | grades[0]
0xbffff80d | 0x00 |
0xbffff80e | 0x00 |
0xbffff80f | 0x00 |
0xbffff810 | 0x60 | grades[1]
0xbffff811 | 0x00 |
0xbffff812 | 0x00 |
0xbffff813 | 0x00 |
0xbffff814 | 0x23 | alpha
0xbffff815 | 0x01 |
0xbffff816 | 0x00 |
0xbffff817 | 0x00 |
0xbffff818 | ‘A’ = 0x41 | initial
0xbffff819 | gar (garbage) |
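Reading the table back into values means reversing the little-endian layout. A minimal sketch with a hypothetical helper, from_little_endian, that rebuilds a 32-bit value from four consecutive bytes:

```c
#include <stdint.h>

/* Rebuilds a 32-bit value from four bytes stored in little-endian order:
   bytes[0] is the least significant byte, bytes[3] the most significant. */
uint32_t from_little_endian(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Applied to the four bytes of alpha (0x23, 0x01, 0x00, 0x00) it gives 0x00000123 = 291, and applied to the bytes of grades[1] (0x60, 0x00, 0x00, 0x00) it gives 96.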
Example 2: The register ebp points to the "bottom" of the stack (see picture below). Upon further review of the assembly code you
determine that two strings are stored in memory, one at address ebp-40 and the other at ebp-24. (Note that the numbers 40 and
24 are ordinary base-10 numbers, not base-16.)
Address | +0 | +1 | +2 | +3 | Location
0xbffff7e0 | 0x02 | 0x85 | 0x04 | 0x08 |
0xbffff7e4 | 0x00 | 0xf8 | 0xff | 0xbf |
0xbffff7e8 | 0xf0 | 0xf7 | 0xff | 0xbf |
0xbffff7ec | 0x39 | 0x84 | 0x04 | 0x08 |
0xbffff7f0 | 0x4f = ‘O’ | 0x63 = ‘c’ | 0x74 = ‘t’ | 0x6f = ‘o’ | ebp-40
0xbffff7f4 | 0x62 = ‘b’ | 0x65 = ‘e’ | 0x72 = ‘r’ | 0x00 |
0xbffff7f8 | 0x18 | 0xf8 | 0xff | 0xbf |
0xbffff7fc | 0xf4 | 0x5f | 0xfd | 0xb7 |
0xbffff800 | 0x54 = ‘T’ | 0x65 = ‘e’ | 0x6e = ‘n’ | 0x74 = ‘t’ | ebp-24
0xbffff804 | 0x68 = ‘h’ | 0x00 | 0x04 | 0x08 |
0xbffff808 | 0x35 | 0x07 | 0x00 | 0x00 | ebp-16
0xbffff80c | 0xf4 | 0x5f | 0xfd | 0xb7 |
0xbffff810 | 0xe0 | 0x0c | 0x00 | 0xb8 |
0xbffff814 | 0x35 | 0x07 | 0x00 | 0x00 |
0xbffff818 | 0x78 | | | | ebp
a. Determine the string stored at address ebp-40.
“October”
b. Determine the string stored at address ebp-24.
“Tenth”
c. Assume that an integer is stored at address ebp-16. What is the decimal value of this integer?
0x00000735 = 1845
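The three answers can be checked by treating part of the dump as a byte array and indexing it relative to ebp. This is a sketch under stated assumptions: the array holds only the bytes from 0xbffff7f0 through 0xbffff80b, and the names DUMP_BASE, EBP, string_at_ebp_minus, and int_at_ebp_minus are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DUMP_BASE 0xbffff7f0u   /* address of dump[0] */
#define EBP       0xbffff818u   /* value of ebp given in Example 2 */

/* Bytes at addresses 0xbffff7f0 through 0xbffff80b, taken from Example 2. */
static const uint8_t dump[] = {
    0x4f, 0x63, 0x74, 0x6f,   /* 0xbffff7f0 */
    0x62, 0x65, 0x72, 0x00,   /* 0xbffff7f4 */
    0x18, 0xf8, 0xff, 0xbf,   /* 0xbffff7f8 */
    0xf4, 0x5f, 0xfd, 0xb7,   /* 0xbffff7fc */
    0x54, 0x65, 0x6e, 0x74,   /* 0xbffff800 */
    0x68, 0x00, 0x04, 0x08,   /* 0xbffff804 */
    0x35, 0x07, 0x00, 0x00    /* 0xbffff808 */
};

/* Pointer to the dump byte located 'offset' bytes below ebp.
   The caller must keep the resulting address inside the dump. */
static const uint8_t *at_ebp_minus(uint32_t offset)
{
    return &dump[(EBP - offset) - DUMP_BASE];
}

/* Strings are stored one ASCII byte per address, in order, ending at 0x00. */
const char *string_at_ebp_minus(uint32_t offset)
{
    return (const char *)at_ebp_minus(offset);
}

/* Integers are stored in little-endian order, least significant byte first. */
uint32_t int_at_ebp_minus(uint32_t offset)
{
    const uint8_t *p = at_ebp_minus(offset);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Here string_at_ebp_minus(40) points at "October", string_at_ebp_minus(24) points at "Tenth", and int_at_ebp_minus(16) evaluates to 1845. The offsets 40, 24, and 16 are decimal, so ebp-40 is 0xbffff818 - 0x28 = 0xbffff7f0.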