What is Assembly Language?

Introduction
Chapter 1
- What is Assembly Language?
- Data Representation
Table 1. Software Hierarchy Levels

Level | Description
Application Program | Software designed for a particular class of applications.
High-Level Language (HLL) | Programs are compiled into either assembly language or machine language. E.g. C++, Pascal, Java, Visual Basic.
Operating System | Contains procedures that can be called from programs written in either high-level language or assembly language. The system may also contain an application programming interface (API).
Assembly Language (ASM) | Uses instruction mnemonics that have a one-to-one correspondence with machine language.
Machine Language (ML) | Numeric instructions and operands that can be stored in memory and directly executed by the computer processor.
What is Assembly Language?
- A low-level, processor-specific programming language designed to match the processor’s machine instruction set
- Each assembly language instruction matches exactly one machine language instruction
- Here we study Intel’s 80x86 family (including the Pentiums)
Why learn Assembly Language?
- To learn how high-level language code gets translated into machine language
  - i.e., to learn the details hidden in HLL code
- To learn the computer’s hardware
  - by direct access to memory, video controller, sound card, keyboard…
- To speed up applications
  - direct access to hardware (e.g., writing directly to I/O ports instead of making a system call)
  - good ASM code is faster and smaller: rewrite the critical areas of code in ASM
Assembly Language Applications
- Application programs are rarely written completely in assembly language
  - only time-critical parts are written in ASM
  - Ex. 1: an interface subroutine (called from HLL programs) is written in ASM for direct hardware access
  - Ex. 2: device drivers (called from the OS)
- ASM is often used for embedded systems (programs stored in PROM chips)
  - computer cartridge games, microcontrollers (automobiles, industrial plants...), telecommunication equipment…
- ASM code is very fast and compact, but processor-specific
Table 2. Comparison of Assembly Language and High-Level Languages

Type of Application | High-Level Language | Assembly Language
Business application software for single platform. | Formal structures make it easy to organize and maintain. | No formal structure.
Hardware device driver. | Awkward coding techniques required. | Hardware access is straightforward and simple.
Business application for multiple platforms. | Portable. | Difficult to maintain.
Embedded systems and computer games requiring direct hardware access. | Produces too much executable code, and may not run efficiently. | Ideal, because the executable code is small and runs quickly.
Machine Language
- An assembler is a program that converts ASM code into machine language code:
  - mov al,5 (Assembly Language)
  - 1011000000000101 (Machine Language)
  - the most significant byte is the opcode for “move into register AL”
  - the least significant byte is for the operand “5”
- Directly programming in machine language offers no advantage (over Assembly)...
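The byte split described above can be checked with a few lines of Python. This is only an illustrative sketch, not assembler output; the variable names are invented for the example.

```python
# The 16-bit machine word shown on the slide for "mov al,5".
machine_code = 0b1011000000000101

# Most significant byte: the opcode ("move into register AL").
opcode = (machine_code >> 8) & 0xFF
# Least significant byte: the operand.
operand = machine_code & 0xFF

print(f"opcode  = {opcode:08b}b = {opcode:02X}h")
print(f"operand = {operand:08b}b = {operand}")
```

The opcode byte comes out as B0h and the operand byte as 5, matching the two halves of the bit pattern.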
Binary Numbers/Storage Size
- Binary numbers are used to store both code and data
- On Intel’s x86:
  - byte = 8 bits (smallest addressable unit)
  - word = 2 bytes
  - doubleword = 2 words
  - quadword = 2 doublewords
Data Representation
- Even if we know that a block of memory contains data, to obtain its value we need to choose an interpretation
- Ex: memory content “0100 0001” can represent either:
  - the number 2^6 + 1 = 65
  - or the ASCII code of character “A”
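A short Python sketch of the same idea: one byte of memory, read under two interpretations. The names raw, as_number and as_char are chosen for the example only.

```python
# One byte of "memory" holding the bit pattern 0100 0001.
raw = bytes([0b01000001])

# Interpretation 1: an unsigned number.
as_number = raw[0]
# Interpretation 2: an ASCII character code.
as_char = raw.decode("ascii")

print(as_number)  # 65
print(as_char)    # A
```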
Data Representation
- Number Systems
  - Binary/Octal/Decimal/Hexadecimal
  - Converting between various number systems
- Signed/Unsigned Interpretation
  - Two’s Complement
- Addition/Subtraction
- Character Storage
Number Systems
- A written number is meaningful only with respect to a base
- To tell the assembler which base we use:
  - Hexadecimal 25 is written as 25h
  - Octal 25 is written as 25o or 25q
  - Binary 1010 is written as 1010b
  - Decimal 1010 is written as 1010 or 1010d
- You are supposed to know how to convert from one base to another (see appendix A)
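To make the suffix notation concrete, here is a minimal Python sketch that reads a literal written in that style. The helper parse_literal and the table SUFFIX_BASE are hypothetical names for this example; a real assembler applies further rules that are not modeled here.

```python
# Hypothetical helper: map each radix suffix from the slide to its base.
SUFFIX_BASE = {"h": 16, "o": 8, "q": 8, "b": 2, "d": 10}

def parse_literal(text):
    """Return the value of a literal such as '25h', '25q', '1010b' or '1010'."""
    text = text.strip().lower()
    base = SUFFIX_BASE.get(text[-1])
    if base is None:
        # No suffix: the literal is decimal.
        return int(text, 10)
    return int(text[:-1], base)

for literal in ["25h", "25o", "25q", "1010b", "1010", "1010d"]:
    print(literal, "=", parse_literal(literal))
```

The same digits give different values depending on the base: 25h is 37, 25o is 21, 1010b is 10 and 1010d is 1010.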
Binary Numbers
- Digits are 1 and 0
  - 1 = true
  - 0 = false
- MSB – most significant bit
- LSB – least significant bit
- Bit numbering (bit 15 is the MSB, bit 0 is the LSB):

  MSB          LSB
  1011001010011100
  15             0
Converting between various number systems
- Converting Binary to Decimal
- Converting Decimal to Binary
- Converting Binary to Hexadecimal
- Converting Hexadecimal to Decimal
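The four conversions listed above can be sketched in Python using the usual pencil-and-paper methods (positional weights, repeated division by 2, grouping bits by four). The function names are invented for this example, and the methods shown are the standard ones, not necessarily those of appendix A.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_decimal(bits):
    # Process digits left to right: double the running value, add the bit.
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value

def decimal_to_binary(n):
    # Repeated division by 2; the remainders are the bits, last one first.
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))
        n //= 2
    return "".join(reversed(digits))

def binary_to_hex(bits):
    # Pad on the left to a multiple of 4 bits, then convert each group.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[binary_to_decimal(g)] for g in groups)

def hex_to_decimal(digits):
    # Same idea as binary_to_decimal, with base 16.
    value = 0
    for d in digits.upper():
        value = value * 16 + HEX_DIGITS.index(d)
    return value

print(binary_to_decimal("01000001"))      # 65
print(decimal_to_binary(65))              # 1000001
print(binary_to_hex("1011001010011100"))  # B29C
print(hex_to_decimal("25"))               # 37
```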
Signed and Unsigned Interpretation
- When a memory block contains a number, to obtain its value we must choose either:
  - the signed interpretation: in that case the most significant bit (msb) represents the sign
    - Positive number (or zero) if msb = 0
    - Negative number if msb = 1
  - the unsigned interpretation: in that case all the bits are used to represent a magnitude (i.e., a positive number, or zero)
Signed Integers
The highest bit indicates the sign: 1 = negative, 0 = positive.

  11110110   Negative (sign bit = 1)
  00001010   Positive (sign bit = 0)

If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D
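The hexadecimal rule can be tried out with a small Python sketch. The helper is_negative_hex is a hypothetical name, and it assumes the hex string fills the whole storage size (for example, exactly two digits for a byte).

```python
def is_negative_hex(hex_digits):
    """True if the leading hex digit is above 7, i.e. the sign bit is set.

    Assumption: hex_digits covers the full storage size, with no digits omitted.
    """
    return int(hex_digits[0], 16) > 7

for value in ["8A", "C5", "A2", "9D", "0A", "7F"]:
    print(value, "negative" if is_negative_hex(value) else "positive")
```

The rule works because a leading hex digit above 7 has its top bit set, and that top bit is the sign bit.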
Two’s Complement Notation
- Used to represent negative numbers
- The two’s complement of a positive number X, denoted by NEG(X), is obtained by complementing all its bits and adding 1
- NEG(X) = NOT(X) + 1
  - Ex: NEG(10) = NOT(10) + 1
    = NOT(0000 1010b) + 1
    = (1111 0101b) + 1 = 1111 0110b = -10
- It follows that X + NEG(X) = 0
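The NEG(X) = NOT(X) + 1 rule can be sketched in Python for 8-bit values. Python integers are not limited to 8 bits, so the mask 0xFF is used here to imitate a byte; neg8 is a name invented for this example.

```python
def neg8(x):
    """Two's complement of x within 8 bits: NOT(x) + 1, truncated to a byte."""
    return (~x + 1) & 0xFF

result = neg8(10)
print(f"NEG(10) = {result:08b}b")          # 11110110
print(f"10 + NEG(10) = {(10 + result) & 0xFF}")  # 0, once the carry is dropped
```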
Forming the Two's Complement
- Negative numbers are stored in two's complement notation
- The two's complement represents the additive inverse
- Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded)
Binary Subtraction
- To perform the difference X - Y:
  - the machine executes the addition X + NEG(Y)

    00001100          00001100
  – 00000011        + 11111101
                    ----------
                      00001001

Practice: Subtract 0101 from 1001.
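Subtraction by adding the two's complement can be sketched in Python for 8-bit values. The helpers neg8 and sub8 are hypothetical names, and the 0xFF mask stands in for the fixed width of a byte.

```python
def neg8(y):
    """Two's complement of y within 8 bits."""
    return (~y + 1) & 0xFF

def sub8(x, y):
    """Compute x - y as x + NEG(y), discarding any carry out of bit 7."""
    return (x + neg8(y)) & 0xFF

x, y = 0b00001100, 0b00000011
print(f"NEG(y) = {neg8(y):08b}")     # 11111101
print(f"x - y  = {sub8(x, y):08b}")  # 00001001
```

When the result is negative, the same bits come out in two's complement form: sub8(3, 12) gives 11110111, which is -9 under the signed interpretation.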
Maximum and Minimum Values
- The msb of a signed number is used for its sign
  - fewer bits are left for its magnitude
- Ex: for a signed byte
  - smallest non-negative = 0000 0000b = 0
  - largest positive = 0111 1111b = 127
  - largest negative = 1111 1111b = -1
  - smallest negative = 1000 0000b = -128
Ranges of Unsigned Integers
Standard sizes:

Storage type | Bits
byte | 8
word | 16
doubleword | 32
quadword | 64

What is the largest unsigned integer that may be stored in 20 bits?
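For the standard sizes, the unsigned range follows from the bit count: n bits hold values from 0 to 2^n - 1. A minimal Python sketch, with unsigned_range as an invented helper name:

```python
def unsigned_range(bits):
    """Smallest and largest unsigned values that fit in the given number of bits."""
    return 0, 2**bits - 1

for name, bits in [("byte", 8), ("word", 16), ("doubleword", 32), ("quadword", 64)]:
    low, high = unsigned_range(bits)
    print(f"{name:<10} {bits:>2} bits: {low} to {high}")
```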
Ranges of Signed Integers
The highest bit is reserved for the sign. This limits the range: an n-bit signed integer holds values from -2^(n-1) to 2^(n-1) - 1.
Practice: What is the largest positive value that may be stored in 20 bits?
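With one bit reserved for the sign, an n-bit two's complement integer runs from -2^(n-1) to 2^(n-1) - 1. A short Python sketch under that assumption, with signed_range as an invented helper name:

```python
def signed_range(bits):
    """Smallest and largest two's complement values for the given bit count."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1

for name, bits in [("byte", 8), ("word", 16), ("doubleword", 32)]:
    low, high = signed_range(bits)
    print(f"{name:<10} {bits:>2} bits: {low} to {high}")
```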
Signed/Unsigned Interpretation (again)
- To obtain the value of a number we need to choose an interpretation
- Ex: memory content 1111 1111 can represent either:
  - -1 if a signed interpretation is used
  - 255 if an unsigned interpretation is used
- Only the programmer can provide an interpretation of the content of memory
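Python's built-in int.from_bytes makes the programmer's choice explicit through its signed argument. This sketch reads the same byte both ways; the variable names are chosen for the example.

```python
# One byte of memory with all bits set: 1111 1111.
memory = bytes([0b11111111])

# The byte order is irrelevant for a single byte but must be given.
signed_value = int.from_bytes(memory, "little", signed=True)
unsigned_value = int.from_bytes(memory, "little", signed=False)

print(signed_value)    # -1
print(unsigned_value)  # 255
```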
Character Storage Systems
- Character sets
  - Standard ASCII (0 – 127)
  - Extended ASCII (0 – 255)
  - ANSI (0 – 255)
  - Unicode (0 – 65,535)
- Null-terminated String
  - Array of characters followed by a null byte
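A null-terminated string can be imitated in Python with a bytes object. The helpers to_c_string and from_c_string are hypothetical names for this sketch, which assumes plain ASCII text.

```python
def to_c_string(text):
    """Encode text as ASCII bytes followed by a terminating null byte."""
    return text.encode("ascii") + b"\x00"

def from_c_string(buffer):
    """Read characters up to, but not including, the first null byte."""
    end = buffer.index(0)
    return buffer[:end].decode("ascii")

stored = to_c_string("ABC")
print(list(stored))                      # [65, 66, 67, 0]
print(from_c_string(b"ABC\x00ignored"))  # ABC
```

Whatever follows the null byte is ignored, which is how the end of the string is found without storing its length.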
ASCII vs Extended ASCII
- The ASCII code (from 00h to 7Fh)
  - Only codes from 20h to 7Eh represent printable characters. The rest are control codes (used for printing, transmission…).
- Extended ASCII character set (codes 80h to FFh)
  - Varies from one system to another
  - MS-DOS usage: for accented characters, Greek symbols and some graphic characters
The ASCII character set
- CR = “carriage return” (MS-DOS: move to beginning of line)
- LF = “line feed” (MS-DOS: move directly one line below)
- SPC = “blank space”
Text Files
- These are files containing only ASCII characters
- But different conventions are used for indicating an “end-of-line”
  - MS-DOS: <CR>+<LF>
  - UNIX: <LF>
  - MAC: <CR>
- This is the source of many problems encountered during transfers of text files from one system to another
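One way to deal with the three conventions is to normalize line endings when a file moves between systems. The functions to_unix and to_dos below are a hypothetical Python sketch working on raw bytes, not a standard tool.

```python
def to_unix(data):
    """Convert <CR><LF> and lone <CR> line endings to <LF>."""
    # Replace the two-byte sequence first so its <CR> is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def to_dos(data):
    """Convert any of the three conventions to <CR><LF>."""
    return to_unix(data).replace(b"\n", b"\r\n")

mixed = b"dos line\r\nmac line\runix line\n"
print(to_unix(mixed))
print(to_dos(mixed))
```

The order of the replacements in to_unix matters: handling the lone <CR> first would turn each <CR><LF> pair into two line breaks.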
Strings and numbers
- A string is stored as an array of characters
- A 1-byte ASCII code is stored for each char
- Hence, we can store the number 123 either in numerical form or as the string “123”
  - The string form is best for display
  - The numerical form is best for computations
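The difference between the two forms shows up in the bytes that are actually stored. This Python sketch assumes the number is kept in a single byte; the variable names are chosen for the example.

```python
number = 123
text = "123"

# Numerical form: one byte holding the value 123 (7Bh).
numeric_bytes = number.to_bytes(1, "little")
# String form: one ASCII code per character (31h 32h 33h).
string_bytes = text.encode("ascii")

print(numeric_bytes.hex())  # 7b
print(string_bytes.hex())   # 313233

# Computation needs the numerical form; display needs the string form.
print(int(text) + 1)        # 124
print(str(number))          # 123
```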