Assembly Language

COS2014
Basic Concepts
Department of Computer Science
Faculty of Science
RU.
Assembly Language for Intel-Based Computers, 5th Edition
Kip Irvine
Why study assembly language?





Learn about computers
• Computer architecture
• Operating systems
• Data representation
• Hardware devices
Learn about assembly languages
Learn about compiling
Learn how to write embedded programs
Learn the assembly language for Intel 80x86
Welcome to Assembly Language:
Definitions



Assembly language: machine-specific language with
a one-to-one correspondence with the machine
language for that computer
Machine language: The language a particular
processor understands
Assembler: converts programs from assembly
language to machine language
Welcome to Assembly Language:
Example


High-Level Language:
x = a + b;
Assembly Language:
MOV AX, a
ADD AX, b
MOV x, AX
Machine language:
A1 0002
06 0004
A3 0000
Welcome to Assembly Language:
Problems with assembly language





Provides no structure
Is not portable
Applications can be very long
“Hard” to read and understand
Lots of detail required
Assembly Language


Machine language:
• Machine instructions: direct instructions to the processor, e.g. encoded to control the datapath
• A numeric language understood by the processor
Assembly language:
• Statements (instructions) written as short mnemonics and symbolic references, e.g. ADD, MOV, CALL, var1, i, j, that have a 1-to-1 relationship with machine instructions
• Readable by humans
When you are writing a C program,
what does a computer look like?
CPU, memory, I/O,
Operations on variables,
…
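A Model of Computer for C
[Figure: the computer as a C program sees it. A CPU with a program counter steps through statements such as i = i + j; xfloat = 1.0; if (A[0]==0) and applies operators and control constructs (+, &&, if, for). Memory is typed storage holding variables such as i, j, k; xfloat, yfloat; A[0], A[1], …]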
This model is quite different from
what the hardware in a computer
does when you run your program!
Why can C program code
run on your computer?
Obviously, someone does some
translation for you!
We have known …
C program
x = (a+b) * c
C compiler
Assembly program
MOV AX, a
ADD AX, b
MUL c
MOV x, AX
We can see that assembly language is closer to the real
computer hardware!
From the viewpoint of assembly language, what does a computer look
like?
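A Model of Computer for ASM
[Figure: the computer as an assembly program sees it. A CPU with registers (PC, AX, BX, …) and arithmetic units (+, -) executes instructions such as MOV AX, a; ADD AX, b; MOV x, AX. Memory holds the untyped bit patterns of a, b and x, e.g. a = 010100110010101, b = 110010110001010, x = 000000000010010]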
Assembly programs and machine code
are still some distance from the real
computer hardware,
e.g. multi-core CPUs, hyperthreading
We need one more level of
translation!
Computer Model of a Lower Layer
[Figure: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers, control unit, PC, instruction memory, register file, sign extension, ALU and ALU control, and data memory, with control signals such as PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite and MemtoReg]
(from Computer Architecture textbook)
A Layered View of Computer
[Figure: the three models stacked as layers. Top: the C model (statements such as i = i + j; typed variables; operators). Middle: the assembly model (MOV AX, a; ADD AX, b; MOV x, AX; registers PC, AX, BX, …; memory words a, b, x). Bottom: the pipelined hardware datapath]
From Assembly to Binary
Assembly
MOV AX, a
ADD AX, b
MUL c
MOV x, AX
Assembler
00000000101000010000000000011000
00000000100011100001100000100001
10001100011000100000000000000000
10001100111100100000000000000100
10101100111100100000000000000000
10101100011000100000000000000100
00000011111000000000000000001000
Machine code
Different Levels of Abstractions
High Level Language Program
temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;
Compiler
Assembly Language Program
MOV AX, a
ADD AX, b
MOV x, AX
Assembler
Machine Language Program
0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111
Machine Interpretation
Control Signal
ALUOP[0:3] <= InstReg[9:11] & MASK
Virtual Machine Concept: more elaborated later in Sec. 1-2
What’s Next?


Virtual machine concept (Sec. 1-2)
Data representation (Sec. 1-3)
Virtual Machine Concept

Purpose of this section:
• Understand the role of assembly language in a computer system
By-product:
• The principle of layered abstraction for managing complexity, e.g. the OSI 7-layer protocol model
Virtual Machine Concept



A layered abstraction of computers proposed by A. Tanenbaum
Each layer provides an abstract computer, or virtual machine, to its upper layer
Virtual machine:
• A hypothetical computer that can be constructed of either HW or SW
What is a computer?
Level 5: High-Level Language
Level 4: Assembly Language
Level 3: Operating System
Level 2: Instruction Set Architecture
Level 1: Microarchitecture
Level 0: Digital Logic
Simplest Model of Computers
[Figure: a compute engine with memory takes a program (instructions) and input data and produces output data; c.f., y = f(x)]
Layered abstraction: A computer consists of layers of
such virtual machine abstractions
Why Layered Abstraction?

Big idea: layered abstraction to combat
complexities





A strategy of divide-and-conquer
Decompose a complex system into layers with
well-defined interfaces
Each layer is easier to manage and handle
Only need to focus on a particular layer, e.g. to
make it the best
Also, it makes interaction clear

Particularly if one layer is realized in hardware and
the other in software
Layered Abstraction of Computer

Each layer is a hypothetical computer, or virtual machine, that runs a programming language
• Can be programmed with that programming language to process inputs and outputs
[Figure: compute engine with memory; program (instructions) and input data in, output data out]
A program written in Li can be mapped to one in Li-1 by:
• Interpretation: an Li-1 program interprets and executes the Li instructions one by one
• Translation: the Li program is completely translated into an Li-1 program, which then runs on the Li-1 machine
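Interpretation can be sketched in C as a loop that fetches, decodes and executes one instruction at a time. The three-instruction accumulator machine below (OP_LOAD, OP_ADD, OP_STORE) is hypothetical and invented only for this illustration; it is not the 80x86 instruction set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-instruction machine, invented for illustration. */
enum opcode { OP_LOAD, OP_ADD, OP_STORE };

struct instr {
    enum opcode op;
    int operand;            /* index into the data memory array */
};

/* Interpretation: the lower-level program (this C function) executes the
   upper-level instructions one by one instead of translating them first. */
void interpret(const struct instr *prog, size_t n, int *mem)
{
    assert(prog != NULL && mem != NULL);
    int acc = 0;                            /* single accumulator register */
    for (size_t pc = 0; pc < n; pc++) {     /* pc plays the program counter */
        switch (prog[pc].op) {
        case OP_LOAD:  acc = mem[prog[pc].operand];  break;
        case OP_ADD:   acc += mem[prog[pc].operand]; break;
        case OP_STORE: mem[prog[pc].operand] = acc;  break;
        }
    }
}
```

Running the program {OP_LOAD 1, OP_ADD 2, OP_STORE 0} on this sketch mirrors the MOV / ADD / MOV sequence used for x = a + b. A translator would instead emit the equivalent lower-level program once and run that.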
Layered Abstraction of Computer
[Figure: the layered view of the computer again, with the C model on top, the assembly model labelled as virtual machine Li, and the pipelined hardware datapath labelled Li-1]
Languages of Different Layers
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
Assembly Language:
mov eax,A
mul B
add eax,C
call WriteInt
Intel Machine Language:
A1 00000000
F7 25 00000004
03 05 00000008
E8 00500000
High-Level Language
Level 5
• Application-oriented languages, e.g., C, C++, Java, Perl
• Written with a certain programming model in mind
‒ Variables in storage
‒ Operators for operations
• Programs are compiled into assembly language (Level 4) or interpreted by interpreters
What kind of computer does C see?
Assembly Language
Level 4
• Instruction mnemonics that have a one-to-one correspondence to machine language
• Based on a view of machine: register organization, addressing, operand types and locations, functional units, …
• Calls functions written at the OS level (Level 3)
• Programs are translated into machine language (Level 2)
What kind of computer does it see?
Operating System
Level 3
• Provides services to Level 4 programs as if it were a computer
• Programs are translated and run at the instruction set architecture level (Level 2)
Instruction Set Architecture
Level 2
• Known as conventional machine language
• Attributes of a computer as seen by the assembly programmer, i.e. conceptual structure and functional behavior
‒ Organization of programmable storage
‒ Data types and data structures
‒ Instruction set and formats
‒ Addressing modes and data accessing
• Executed by Level 1 program (microarchitecture)
Microarchitecture
Level 1
• Can be described by register transfer language (RTL)
• Interprets conventional machine instructions (Level 2)
• Executed by digital hardware (Level 0)
[Figure: a controller driven by the clock sends control signals to the registers (IR, PC), the ALU with its N and Z flags, and memory]
Digital Logic
Level 0
• CPU, constructed from digital logic gates
• System bus
• Memory
What’s Next?


Virtual machine concept (Sec. 1-2)
Data representation (Sec. 1-3)
Data Representation

Purpose of this section
Assembly programs often need to process data and manage data storage and memory locations, so we need to know data representation and storage
• Binary numbers: translating between binary and decimal
• Binary addition
• Integer storage sizes
• Hexadecimal integers: translating between decimal and hex.; hex. subtraction
• Signed integers: binary subtraction
• Character storage
Binary Numbers

Digits are 1 and 0
• 1 = true
• 0 = false
MSB – most significant bit
LSB – least significant bit
Bit numbering (bit 15 is the MSB, bit 0 is the LSB):
MSB          LSB
1011001010011100
15             0
Binary Numbers


Each digit (bit) is either 1 or 0
Each bit represents a power of 2
Every binary number is a sum of powers of 2
Bit:     1    1    1    1    1    1    1    1
Weight:  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
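The sum-of-powers rule can be sketched in C as follows. The function name binary_to_unsigned is illustrative only; the sketch assumes the input string contains just the characters '0' and '1' and is short enough to fit in an unsigned long.

```c
#include <assert.h>
#include <string.h>

/* Adds the weight 2^i for every bit position i that holds a '1'. */
unsigned long binary_to_unsigned(const char *bits)
{
    assert(bits != NULL);
    size_t n = strlen(bits);
    unsigned long value = 0;
    unsigned long weight = 1;            /* 2^0, the weight of the LSB */
    for (size_t i = n; i > 0; i--) {     /* walk from the LSB to the MSB */
        if (bits[i - 1] == '1')
            value += weight;
        weight *= 2;                     /* next bit is worth twice as much */
    }
    return value;
}
```

For example, "11111111" yields 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255.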
Integer Storage Sizes
Standard sizes:
• byte: 8 bits
• word: 16 bits
• doubleword: 32 bits
• quadword: 64 bits
Why unsigned numbers?
What is the largest unsigned integer that may be stored in 20 bits?
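An n-bit unsigned integer with every bit set to 1 has the value 2^n - 1. A minimal C sketch of that rule follows; max_unsigned is a hypothetical helper name, and the sketch is limited to widths of 1 to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value that fits in `bits` bits: 2^bits - 1. */
uint64_t max_unsigned(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    if (bits == 64)
        return UINT64_MAX;               /* shifting by 64 would be undefined */
    return (UINT64_C(1) << bits) - 1;
}
```

With this sketch, max_unsigned(8) is 255 and max_unsigned(16) is 65535, matching the byte and word sizes above.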
Signed Integers

The highest bit indicates the sign
1 = negative, 0 = positive
sign bit
1 1 1 1 0 1 1 0   Negative
0 0 0 0 1 0 1 0   Positive
If the highest digit of a hexadecimal integer is > 7, the
value is negative. Examples: 8A, C5, A2, 9D
Forming Two's Complement

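Negative numbers are stored in two's complement notation
To form the two's complement of a value:
• Complement (reverse) each bit
• Add 1
Why?
Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded), so 11111111 behaves as -1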
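The two steps can be sketched in C for 8-bit values. The name twos_complement8 is illustrative; uint8_t is used so that the carry out of bit 7 is dropped, as in the 00000001 + 11111111 example.

```c
#include <assert.h>
#include <stdint.h>

/* Forms the 8-bit two's complement: invert every bit, then add 1. */
uint8_t twos_complement8(uint8_t x)
{
    uint8_t inverted = (uint8_t)~x;             /* step 1: complement each bit */
    uint8_t result = (uint8_t)(inverted + 1u);  /* step 2: add 1, drop carry  */
    /* A value plus its two's complement is zero in 8 bits. */
    assert((uint8_t)(x + result) == 0);
    return result;
}
```

For instance, the two's complement of 00000101 (5) is 11111011 (FBh), the 8-bit pattern for -5.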
Binary Addition

Starting with the LSB, add each pair of digits,
include the carry if present.
carry:                1
                0 0 0 0 0 1 0 0   (4)
+               0 0 0 0 0 1 1 1   (7)
                0 0 0 0 1 0 1 1   (11)
bit position:   7 6 5 4 3 2 1 0
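The column-by-column procedure can be sketched in C. This is only an illustration of the hand method (add_bits8 is a made-up name); in practice a single + does the same work.

```c
#include <assert.h>
#include <stdint.h>

/* Adds two 8-bit values one bit position at a time, starting at the LSB
   and carrying into the next position, as in the worked example. */
uint8_t add_bits8(uint8_t a, uint8_t b, int *carry_out)
{
    assert(carry_out != 0);
    unsigned sum = 0;
    unsigned carry = 0;
    for (int pos = 0; pos < 8; pos++) {
        unsigned bit_a = (a >> pos) & 1u;
        unsigned bit_b = (b >> pos) & 1u;
        unsigned total = bit_a + bit_b + carry;   /* 0, 1, 2 or 3 */
        sum |= (total & 1u) << pos;               /* low bit stays in place */
        carry = total >> 1;                       /* high bit is the carry  */
    }
    *carry_out = (int)carry;                      /* carry out of bit 7 */
    return (uint8_t)sum;
}
```

Adding 4 and 7 this way produces a carry out of bit position 2 into position 3, giving 00001011 (11).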
Hexadecimal Integers
All values in memory are stored in binary. Because
long binary numbers are hard to read, we use
hexadecimal representation.
Translating Binary to Hexadecimal
• Each hexadecimal digit corresponds to 4 binary bits.
• Example: Translate the binary integer
000101101010011110010100 to hexadecimal:
0001 0110 1010 0111 1001 0100
   1    6    A    7    9    4   = 16A794h
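The 4-bits-per-digit grouping can be sketched in C as below. The function name binary_to_hex and its error convention are assumptions made for this example; the sketch requires the input length to be a multiple of 4.

```c
#include <assert.h>
#include <string.h>

/* Converts a string of '0'/'1' characters to hex, 4 bits per hex digit.
   Returns 0 on success, or -1 if the length is not a multiple of 4, a
   character is not binary, or the output buffer is too small. */
int binary_to_hex(const char *bits, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    assert(bits != NULL && out != NULL);
    size_t n = strlen(bits);
    if (n % 4 != 0 || n / 4 + 1 > out_size)
        return -1;
    for (size_t i = 0; i < n; i += 4) {
        unsigned nibble = 0;
        for (size_t j = 0; j < 4; j++) {
            char c = bits[i + j];
            if (c != '0' && c != '1')
                return -1;
            nibble = nibble * 2 + (unsigned)(c - '0');
        }
        out[i / 4] = digits[nibble];     /* one hex digit per 4-bit group */
    }
    out[n / 4] = '\0';
    return 0;
}
```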
Converting Hexadecimal to Decimal

Multiply each digit by its corresponding power of 16:
dec = (D3 × 16^3) + (D2 × 16^2) + (D1 × 16^1) + (D0 × 16^0)
Hex 1234 equals (1 × 16^3) + (2 × 16^2) + (3 × 16^1) + (4 × 16^0),
or decimal 4,660.
Hex 3BA4 equals (3 × 16^3) + (11 × 16^2) + (10 × 16^1) + (4 × 16^0),
or decimal 15,268.
Data representation:
Number systems (bases)

Number systems used
• Binary: the internal representation inside the computer. Externally, values may be represented in binary, decimal, or hexadecimal.
• Decimal: the system people use.
ASCII representations of numbers used for I/O:
• ASCII binary
• ASCII octal
• ASCII decimal
• ASCII hexadecimal
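To compute with a number that arrives as ASCII decimal text, a program first converts the digit characters to the internal binary value. A minimal C sketch, assuming an ASCII character set; the name ascii_decimal_to_binary is illustrative, and the sketch does no overflow checking.

```c
#include <assert.h>

/* Converts a string of ASCII decimal digits to its binary value.
   Conversion stops at the first character that is not a digit. */
unsigned ascii_decimal_to_binary(const char *s)
{
    assert(s != 0);
    unsigned value = 0;
    while (*s >= '0' && *s <= '9') {
        /* '0'..'9' are consecutive codes, so c - '0' is the digit value */
        value = value * 10 + (unsigned)(*s - '0');
        s++;
    }
    return value;
}
```

The two-character string "65" (bytes 36h 35h) becomes the single binary value 65 (41h), which can then be used directly in arithmetic.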
Data representation:
Hex Addition & Multiplication



Hex addition and multiplication tables are large
We can still do simple calculations by hand
  B852h         23Ah
+ 5A65h       * 100h
(Your turn)
  ABCh          2B3h
+ EF0h        * 102h
Data representation:
Converting to decimal


12345
= 1 * 10^4 + 2 * 10^3 + 3 * 10^2 + 4 * 10^1 + 5 * 10^0
(Human) conversions: hex to decimal
ABCDh = 10 * 16^3 + 11 * 16^2 + 12 * 16^1 + 13 * 16^0
= 10*4096 + 11*256 + 12*16 + 13
= 40960 + 2816 + 192 + 13
= 43981
Data representation:
Converting to decimal

(Human) conversions to decimal
ABCDh = ((10*16+11)*16+12)*16+13 = 43981
(easier on calculator)
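The nested form is the same as repeatedly multiplying the running value by 16 and adding the next digit. A minimal C sketch of that loop follows; the names hex_digit_value and hex_to_unsigned are illustrative, and the sketch does no overflow checking.

```c
#include <assert.h>
#include <ctype.h>

/* Returns the value 0..15 of a hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* value = (((d3*16 + d2)*16 + d1)*16 + d0), built left to right.
   Stops at the first non-hex character, such as a trailing 'h'. */
unsigned long hex_to_unsigned(const char *hex)
{
    assert(hex != 0);
    unsigned long value = 0;
    for (; *hex != '\0'; hex++) {
        int d = hex_digit_value(*hex);
        if (d < 0)
            break;
        value = value * 16 + (unsigned long)d;
    }
    return value;
}
```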
Data representation:
Your Turn: Conversion problems


111010b = ________ in base 10
1234 in base 5, or (1234) base 5 = _________ in base 10
Data representation:
Conversion from decimal


(Human) conversions from decimal
2748 (base 10) = ??? in hex
2748 = 171 * 16 + 12
171 = 10 * 16 + 11
10 = 0 * 16 + 10
so the value is ABCh
How do we know this is the hex representation?
2748 = 171 * 16 + 12
     = (10*16 + 11) * 16 + 12
     = 10 * 16^2 + 11 * 16^1 + 12 * 16^0
     = ABCh
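The repeated-division method can be sketched in C for any base from 2 to 16. The function name to_base and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Converts value to the given base by repeated division: each remainder
   is the next digit, least significant first. Returns the number of
   digits written, or 0 if the output buffer is too small. */
size_t to_base(unsigned long value, unsigned base, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[64];                        /* enough for 64 binary digits */
    size_t n = 0;
    assert(out != NULL && base >= 2 && base <= 16);
    do {
        tmp[n++] = digits[value % base]; /* remainder is the next digit */
        value /= base;                   /* quotient feeds the next step */
    } while (value != 0);
    if (n + 1 > out_size)
        return 0;
    for (size_t i = 0; i < n; i++)       /* remainders come out reversed */
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

Calling to_base(2748, 16, buf, sizeof buf) produces the remainders 12, 11, 10 in that order, which read in reverse give "ABC".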
Data representation:
Your Turn: Conversion problems


Write decimal 58 in binary.
Write decimal 194 in base 5
Learn How To Do the Following:





Form the two's complement of a hexadecimal
integer
Convert signed binary to decimal
Convert signed decimal to binary
Convert signed decimal to hexadecimal
Convert signed hexadecimal to decimal
Ranges of Signed Integers
The highest bit is reserved for the sign. This limits the
range: an n-bit signed integer holds values from -2^(n-1) to +(2^(n-1) - 1).
Practice: What is the largest positive value that
may be stored in 20 bits?
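Because one bit is taken by the sign, an n-bit two's complement integer runs from -2^(n-1) to 2^(n-1) - 1. A minimal C sketch of both limits; signed_max and signed_min are hypothetical helper names, limited to widths of 2 to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Largest positive value in `bits` bits: 2^(bits-1) - 1. */
int64_t signed_max(unsigned bits)
{
    assert(bits >= 2 && bits <= 64);
    /* computed as unsigned so the shift cannot overflow a signed type */
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}

/* Most negative value in `bits` bits: -2^(bits-1). */
int64_t signed_min(unsigned bits)
{
    return -signed_max(bits) - 1;
}
```

For a byte this gives -128 to +127, and for a word -32,768 to +32,767.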
Character Storage



Character sets
• Standard ASCII (0 – 127)
• Extended ASCII (0 – 255)
• ANSI (0 – 255)
• Unicode (0 – 65,535)
Null-terminated String
• Array of characters followed by a null byte
Using the ASCII table
• back inside cover of book
Numeric Data Representation




pure binary
• can be calculated directly
ASCII binary
• string of digits: "01010101"
ASCII decimal
• string of digits: "65"
ASCII hexadecimal
• string of digits: "9C"
Character Representation

ASCII (Table of ASCII Codes)




American Standard Code for Information Interchange
Standard encoding scheme used to represent characters
in binary format on computers
7-bit encoding, so 128 characters can be represented
0 to 31 & 127 are "control characters" (cannot print)
‒ Ctrl-A or ^A is 1, ^B is 2, etc.
‒ Used for screen formatting & data communication

32 to 126 are printable (see the last page in textbook)
ASCII Character Codes
CHAR   DECIMAL   HEX   BINARY
'0'    48d       30h   0011 0000b
'9'    57d       39h   0011 1001b
'A'    65d       41h   0100 0001b
'Z'    90d       5Ah   0101 1010b
'a'    97d       61h   0110 0001b
'z'    122d      7Ah   0111 1010b
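Two patterns in these codes are useful in programs: a digit character is 30h above its numeric value, and each lowercase letter is 20h above its uppercase form. A minimal C sketch, assuming an ASCII character set; the function names are illustrative.

```c
#include <assert.h>

/* '0'..'9' are 30h..39h, so subtracting 30h gives the numeric value. */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - 0x30;
}

/* 'a'..'z' are 61h..7Ah and 'A'..'Z' are 41h..5Ah: the difference is 20h. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - 0x20);
    return c;                   /* everything else is left unchanged */
}
```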
Binary Data



Decimal, hex & character representations are easier for
humans to understand; however…
All the data in the computer is binary
An int is typically 32 binary digits
int y = 5; (y = 0x00000005;)
‒ In computer, y = 00000000 00000000 00000000 00000101
int z = -5; (z = 0xFFFFFFFB;)
‒ In computer, z = 11111111 11111111 11111111 11111011
Binary Data

A char is typically 8 binary digits
char x = '5'; (or char x = 53, or char x = 0x35)
‒ In computer, x = 00110101
char x = 5; (or char x = 0x05;)
‒ In computer, x = 00000101
Note that the ASCII character '5' has a different binary
value than the number 5
Also note that 1 ASCII character = 2 hex digits
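The bit patterns shown for int and char values can be reproduced with a small C helper. The name to_bit_string is illustrative; the sketch assumes the caller supplies a buffer of at least bits + 1 characters and that characters use ASCII codes.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the low `bits` bits of value as '0'/'1' characters, MSB first,
   followed by a terminating null byte. */
void to_bit_string(uint32_t value, unsigned bits, char *out)
{
    assert(out != 0 && bits >= 1 && bits <= 32);
    for (unsigned i = 0; i < bits; i++) {
        unsigned pos = bits - 1 - i;             /* start at the MSB */
        out[i] = ((value >> pos) & 1u) ? '1' : '0';
    }
    out[bits] = '\0';
}
```

With this sketch, to_bit_string('5', 8, buf) gives "00110101" while to_bit_string(5, 8, buf) gives "00000101", and passing (uint32_t)-5 with 32 bits gives the pattern of FFFFFFFBh.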
Storage Size Terminology

Byte
• 8 bits (basic storage size for all data)
Word
• 16 bits (2 bytes)
Doubleword
• 32 bits (4 bytes)
Quadword
• 64 bits (8 bytes)