Systems Architecture Lecture 4

advertisement
Systems Architecture
Lecture 4: Compilers, Assemblers, Linkers & Loaders
Jeremy R. Johnson
Anatole D. Ruslanov
William M. Mongan
Some material drawn from CMU CSAPP Slides: Kesden and Puschel
Lec 4
Systems Architecture
1
Introduction
• Objective: To introduce the role of compilers, assemblers,
linkers and loaders. To see what is underneath a C program:
assembly language, machine language, and executable.
Lec 4
Systems Architecture
2
Compilation Process
Source
file
Assembler
Object
file
Source
file
Assembler
Object
file
Linker
Source
file
Assembler
Object
file
Program
library
Lec 4
Systems Architecture
Executable
file
3
Below Your Program
Example from a Unix system
• Source Files: count.c and main.c
• Corresponding assembly code: count.s and main.s
• Corresponding machine code (object code): count.o and
main.o
• Library functions: libc.a
• Executable file: a.out
• format for a.out and object code:
ELF (Executable and Linking Format)
Lec 4
Systems Architecture
4
Producing an Executable Program
Example from a Unix system (SGI Challenge running IRIX 6.5)
• Compiler: count.c and main.c  count.s and main.s
– gcc -S count.c main.c
• Assembler: count.s and main.s  count.o and main.o
– gcc -c count.s main.s
– as count.s -o count.o
• Linker/Loader: count.o main.o libc.a  a.out
– gcc main.o count.o
– ld main.o count.o -lc (additional libraries are required)
Lec 4
Systems Architecture
5
Source Files
void main()
{
int n,s;
printf("Enter upper limit: ");
scanf("%d",&n);
s = count(n);
printf("Sum of i from 1 to %d =
%d\n",n,s);
int count(int n)
{
int i,s;
s = 0;
for (i=1;i<=n;i++)
s = s + i;
return s;
}
}
Lec 4
Systems Architecture
6
Assembly Code for MIPS (count.s)
#.file 1 "count.c"
.option pic2
.section
.text
.text
.align 2
.globl count
.ent count
count:
.LFB1:
.frame $fp,48,$31
# vars= 16, regs= 2/0, args= 0, extra= 1
6
.mask 0x50000000,-8
.fmask 0x00000000,0
subu $sp,$sp,48
.LCFI0:
sd
$fp,40($sp)
Lec 4
Systems Architecture
7
L6:
.LCFI1:
sd
$28,32($sp)
.LCFI2:
move $fp,$sp
.LCFI3:
.set noat
lui $1,%hi(%neg(%gp_rel(count)))
addiu $1,$1,%lo(%neg(%gp_rel(count)))
daddu $gp,$1,$25
.set at
sw
$4,16($fp)
sw
$0,24($fp)
li
$2,1
# 0x1
sw
$2,20($fp)
.L3:
lw
$2,20($fp)
lw
$3,16($fp)
slt $2,$3,$2
beq $2,$0,.L6
b
.L4
Lec 4
lw
$2,24($fp)
lw
$3,20($fp)
addu $2,$2,$3
sw
$2,24($fp)
.L5:
lw
$2,20($fp)
addu $3,$2,1
sw
$3,20($fp)
b
.L3
.L4:
lw
$3,24($fp)
move $2,$3
b
.L2
.L2:
move $sp,$fp
ld
$fp,40($sp)
ld
$28,32($sp)
addu $sp,$sp,48
j
$31
.LFE1:
.end count
Systems Architecture
8
Executable Program for MIPS (a.out)
0000000 7f45 4c46 0102 0100 0000 0000 0000 0000
0000020 0002 0008 0000 0001 1000 1060 0000 0034
0000040 0000 6c94 2000 0024 0034 0020 0007 0028
0000060 0023 0022 0000 0006 0000 0034 1000 0034
0000100 1000 0034 0000 00e0 0000 00e0 0000 0004
0000120 0000 0004 0000 0003 0000 0114 1000 0114
0000140 1000 0114 0000 0015 0000 0015 0000 0004
0000160 0000 0001 7000 0002 0000 0130 1000 0130
0000200 1000 0130 0000 0080 0000 0080 0000 0004
0000220 0000 0008 7000 0000 0000 01b0 1000 01b0
0000240 1000 01b0 0000 0018 0000 0018 0000 0004
0000260 0000 0004 0000 0002 0000 01c8 1000 01c8
0000300 1000 01c8 0000 0108 0000 0108 0000 0004
0000320 0000 0004 0000 0001 0000 0000 1000 0000
0000340 1000 0000 0000 3000 0000 3000 0000 0005
•••••••••••••••••••••
Lec 4
Systems Architecture
9
Assembly Characteristics: Data Types
• “Integer” data of 1, 2, or 4 bytes
– Data values
– Addresses (untyped pointers)
• Floating point data of 4, 8, or 10 bytes
• No aggregate types such as arrays or structures
– Just contiguously allocated bytes in memory
Lec 4
Systems Architecture
10
Assembly Characteristics: Operations
• Perform arithmetic function on register or memory data
• Transfer data between memory and register
– Load data from memory into register
– Store register data into memory
• Transfer control
– Unconditional jumps to/from procedures
– Conditional branches
Lec 4
Systems Architecture
11
Object Code
Code for sum
0x401040
0x55
0x89
0xe5
0x8b
0x45
0x0c
0x03
0x45
0x08
0x89
0xec
0x5d
0xc3
Lec 4
• Assembler
– Translates .s into .o
<sum>:
– Binary encoding of each instruction
– Nearly-complete image of executable
code
– Missing linkages between code in
different files
• Linker
– Resolves references between files
– Combines with static run-time libraries
• E.g., code for malloc, printf
• Total of 13 bytes
– Some libraries are dynamically linked
• Each instruction
1, 2, or 3 bytes
• Linking occurs when program begins
execution
• Starts at address
0x401040
Systems Architecture
12
Disassembling Object Code
Disassembled
00401040 <_sum>:
0:
55
1:
89 e5
3:
8b 45 0c
6:
03 45 08
9:
89 ec
b:
5d
c:
c3
d:
8d 76 00
push
mov
mov
add
mov
pop
ret
lea
%ebp
%esp,%ebp
0xc(%ebp),%eax
0x8(%ebp),%eax
%ebp,%esp
%ebp
0x0(%esi),%esi
• Disassembler
objdump -d p
–
–
–
–
Lec 4
Useful tool for examining object code
Analyzes bit pattern of series of instructions
Produces approximate rendition of assembly code
Can be run on either a.out (complete executable) or .o file
Systems Architecture
13
Alternate Disassembly
Disassembled
Object
0x401040:
0x55
0x89
0xe5
0x8b
0x45
0x0c
0x03
0x45
0x08
0x89
0xec
0x5d
0xc3
Lec 4
0x401040
0x401041
0x401043
0x401046
0x401049
0x40104b
0x40104c
0x40104d
<sum>:
<sum+1>:
<sum+3>:
<sum+6>:
<sum+9>:
<sum+11>:
<sum+12>:
<sum+13>:
push
mov
mov
add
mov
pop
ret
lea
%ebp
%esp,%ebp
0xc(%ebp),%eax
0x8(%ebp),%eax
%ebp,%esp
%ebp
0x0(%esi),%esi
• Within gdb Debugger
gdb p
disassemble sum
– Disassemble procedure
x/13b sum
– Examine the 13 bytes starting at sum
Systems Architecture
14
What Can be Disassembled?
% objdump -d WINWORD.EXE
WINWORD.EXE:
file format pei-i386
No symbols in "WINWORD.EXE".
Disassembly of section .text:
30001000 <.text>:
30001000: 55
30001001: 8b ec
30001003: 6a ff
30001005: 68 90 10 00 30
3000100a: 68 91 dc 4c 30
push
mov
push
push
push
%ebp
%esp,%ebp
$0xffffffff
$0x30001090
$0x304cdc91
• Anything that can be interpreted as executable code
• Disassembler examines bytes and reconstructs assembly
source
Lec 4
Systems Architecture
15
Download