
Lecture № 5
Syntax of Assembly
1. Format of instructions and macroinstructions.
2. Syntax of operands in Assembly.
3. Syntax of operators in Assembly.
Literature.
1. Jurov V. Assembler. SPb.: Piter, 2001. 624 p.
2. Pustovarov V. I. Assembler. Programming and Analysis
of the Correctness of Machine Programs. Kiev: “Irina”, 2000.
476 p.
3. Tanenbaum A. S. Structured Computer Organization, 4th
ed. Upper Saddle River, NJ: Prentice Hall, 2002.
Syntax of Assembly
Every statement in an assembly program is a syntactic
construction of one of four types: instruction,
macroinstruction, directive, or comment.
Each construction is formed according to definite rules.
The format diagrams below illustrate these rules:
Format of instructions and macroinstructions.
[Label:] OC [Operand 1 [, Operand 2]] [; Comments]
Here: Label (the label name) is an identifier whose value is the address
of the first byte of the statement; OC (operation code) is the mnemonic
of the corresponding machine instruction or macroinstruction;
the operands are the parts of the instruction or macroinstruction on which
the operation acts.
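As an illustration of this format, here is a minimal fragment in MASM/TASM syntax, meant to be placed inside a code segment. The label name next_item and the choice of instructions are assumptions made for this sketch, not part of the lecture material.

```asm
; Statement format: [label:] mnemonic [operand1 [, operand2]] [; comment]
        mov   CX, 5        ; mnemonic with two operands and a comment
next_item:                 ; label: its value is the address of the next instruction
        inc   AX           ; mnemonic with one operand
        loop  next_item    ; the label is used as an operand
        nop                ; mnemonic without operands
```

Every field except the mnemonic is optional, which is why a label may stand on a line of its own.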
Format of Directives.
[Name] Directive [Operand 1 [, ... , Operand n]] [; Comments]
Here: Directive is the mnemonic of a translator directive;
Name is an identifier by which the translator distinguishes between directives of the same kind.
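The directive format can be illustrated with a short data segment in MASM/TASM syntax. The names buf, limits, and count are hypothetical and chosen only for this sketch.

```asm
; Directive format: [name] directive [operand1 [, ..., operandN]] [; comment]
data    segment                  ; name = data, directive = segment
buf     db    16 dup (0)         ; name = buf, directive = db, one operand
limits  dw    10, 20, 30         ; several operands separated by commas
count   equ   3                  ; named constant, occupies no memory
data    ends                     ; closes the segment opened above
```

Unlike a label in an instruction, the name of a directive is not followed by a colon.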
The following characters are permitted in the text of an
assembly program:
- all Latin letters: A-Z, a-z (uppercase and lowercase letters
are treated as equivalent);
- digits from 0 to 9;
- the characters ?, @, $, _, &;
- delimiters (separators): , . [ ] ( ) < > { } + / * % ! ' " ? \ =
# ^.
Assembly statements are built from lexical units
(tokens): indivisible sequences of permissible
characters that are meaningful to the translator.
The lexical units are:
- Identifiers are sequences of permissible characters
used to denote program objects such as operation codes,
variable names, and label names. The rule for writing
identifiers: an identifier may consist of one or more
characters, but no more than 255. The characters may be
Latin letters, digits, and some special characters:
_, ?, $, @. An identifier cannot (!) begin with a digit.
- Character strings are sequences of characters
enclosed in single or double quotation marks.
- Integers are written in the binary, decimal, or
hexadecimal number system.
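The three kinds of lexical units can be seen together in a few data definitions. This is a sketch in MASM/TASM syntax; all names are hypothetical, and the radix suffixes b and h are the usual MASM/TASM convention rather than something stated in the lecture.

```asm
; Identifiers, character strings, and integers in data definitions.
msg_1     db  'single quotes'   ; character string in single quotes
msg_2     db  "double quotes"   ; character string in double quotes
_count    dw  25                ; decimal integer (default radix)
bit_mask  db  10011000b         ; binary integer, suffix b
max_val   dw  0FFh              ; hexadecimal integer, suffix h; the leading 0
                                ; keeps it from being read as an identifier
```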
Operands.
Let’s consider a classification of the operands
supported by the assembly translator:
Constant or immediate operands: a number, string, name,
or expression that has a fixed value. The name
must not be relocatable (“removable”), i.e. its value must
not depend on the address at which the program is loaded
into memory (for example, it may be defined with equ or =).
Address operands. These operands specify the physical
location of an operand in memory by giving the two
components of its address: segment and offset.
Syntax of Address Operand Description.
Segment component : Offset component
Segment component: CS, DS, SS, ES, GS, FS, a segment name, or a group name.
Offset component: an integer, an absolute name, or an absolute expression.
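A minimal sketch of address operands in MASM/TASM syntax, meant to be placed inside a code segment, is shown here. It assumes a segment register is used as the segment component and an integer as the offset; the offsets themselves are arbitrary.

```asm
; Address operands written as segment:offset.
        mov   AX, DS:[0010h]    ; word at offset 0010h in the segment addressed by DS
        mov   BX, ES:[0000h]    ; word at offset 0 in the segment addressed by ES
```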
Relocatable (“removable”) operands are any symbolic
names that represent memory addresses. Such an address
may be the location in memory of an instruction (if the
operand is a label) or of data (if the operand is the name
of a memory area in a data segment).
These operands are not tied to a specific address
of physical memory. The segment component of a
relocatable operand's address is not known in advance and
is determined only after the program is loaded into
memory for execution. For example:
data segment
mas_w dw 25 dup (0)
…..
code segment
…..
lea SI, mas_w ; mas_w is a relocatable (“removable”) operand
In this fragment mas_w is a relocatable (“removable”) operand
whose value is the starting address of a memory area
25 words long. The full physical address of this
area becomes known only after the program is loaded.
The address counter is a special kind of operand. It
is denoted by $. Its peculiarity is the following:
when the translator meets this symbol in a program,
it substitutes the current value of the address
counter in its place.
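A common use of the address counter is computing the length of a data item at assembly time. The sketch below uses MASM/TASM syntax; the names msg and msg_len are hypothetical.

```asm
msg      db   'Hello, world'   ; 12 bytes of character data
msg_len  equ  $ - msg          ; $ is the address right after the string,
                               ; so msg_len = 12
```

Because the translator computes $ - msg itself, msg_len stays correct if the string is edited later.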
Register operand. This is simply the name of one
of the processor's registers.
In general, operands may be components of more
complex constructions called expressions.
Expressions are combinations of operands and
operators.
As in high-level languages, assembly operators are
applied during expression evaluation according to their
priorities. Operators of equal priority are applied
from left to right; the order can be changed with
parentheses, which have the highest priority.
Example of operators and their priorities.
Operators                                      Priority
length, size, width, mask, (, ), [, ], <, >    1
.                                              2
...
ptr, offset, seg, type, this                   4
high, low                                      5
+, - (unary)                                   6
*, /, mod, shl, shr                            7
+, - (binary)                                  8
eq, ne, lt, le, gt, ge                         9
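The effect of these priorities can be seen in a few constant definitions. This is a sketch in MASM/TASM syntax; the names v1 to v4 are hypothetical, and every value is computed by the translator at assembly time.

```asm
v1  equ  2 + 3 * 4        ; * (priority 7) before binary + (priority 8): 14
v2  equ  (2 + 3) * 4      ; parentheses are evaluated first: 20
v3  equ  17 mod 5 * 2     ; equal priority, left to right: (17 mod 5) * 2 = 4
v4  equ  -2 + 6           ; unary - (priority 6) before binary +: 4
```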
Let’s briefly characterize the operators:
- Arithmetic operators. The following operators belong to this
type: “+”, “-” (unary and binary); “*”, “/”, “mod”.
Syntax of Arithmetic Operators
[+ | -] Expression_1 {+ | - | * | / | mod} [+ | -] Expression_2
Example:
tab_size equ 68 ; size of the array in bytes
size_el equ 4 ; size of one element in bytes
…..
; the number of elements in the array is computed
; and loaded into the CX register
mov CX,tab_size/size_el ; operator “/”: CX = 17
- Shift operators shift the value of an expression by the
specified number of bit positions.
Example:
mask_b equ 10011000b
…..
mov AL,mask_b shr 3 ; operator “shr”: AL = 00010011b
Syntax of Shift Operators
Expression {shr | shl} Number_of_shifted_positions
- Comparison operators return the value “true” or
“false” and are intended for building logical expressions.
“False” corresponds to 0; “true” is a nonzero value
(MASM and TASM produce -1, i.e. all bits set, rather than 1).
Example:
tab_size equ 30 ; size of the table
….
mov AL,tab_size gt 50 ; load the result of the comparison
; into register AL
cmp AL,0 ; if tab_size <= 50 (the result is “false”),
je m4 ; jump to m4
….
m4: …………………………
- Index operator [ ]. The translator treats this
operator as an instruction to add the value of the
expression outside the brackets to the value of the
expression inside the brackets.
Example:
mov AX,mas[SI] ; load into AX the word at
; address mas+(SI)
- Type override (redefinition) operator ptr. It is used
to override or refine the type of a label or variable
defined by an expression. The type may be one of
the following: byte, word, dword, qword, tbyte,
near, far.
Example:
d_wrd dd 0
......
mov AL,byte ptr d_wrd+1 ; load into AL the second byte
; of the doubleword
- Segment override operator “:” (colon). It tells the
translator to compute the physical address relative to
the given segment component: the name of a segment
register, the name of a segment from a SEGMENT directive,
or a group name.
It is important to keep in mind that the segment used to
fetch instructions cannot (!) be overridden: instructions are
always fetched through CS. This follows from the role of the
code segment in sequential program execution. To fetch the
next instruction, the microprocessor first reads the code
segment register CS, which holds the base (starting) address
of the code segment. To calculate the address of the required
instruction, the microprocessor multiplies the contents of CS
by 16 (that is, shifts it left by 4 bit positions) and then
adds the 16-bit contents of IP to the resulting 20-bit value.
Operands are processed in much the same way. If the
microprocessor determines that an operand is an address
(whose effective address is only part of the physical
address), it knows in which segment the operand is expected
to be located: as a rule, the segment addressed by DS. If
data or data addresses are stored in the stack segment,
registers SP and BP are used, since they, as a rule, hold the
required addresses.
“As a rule” means that such data may also be stored in other
segments, wherever it is more convenient to place them. The
segment override operator serves this purpose. It is
translated into a prefix that slightly modifies how an
instruction works. The prefix occupies an optional field of
the machine instruction and is a one-byte value whose
numeric value determines its purpose. Let’s consider an
example:
code segment
……
jmp metka ; jump over the data field sdq
; (mandatory!)
sdq db 4 ; definition of the data field
metka:
……..
mov AL, CS:sdq ; this override gives access
; to data inside the code segment
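The address calculation described above can be checked with a small worked example. The register values CS = 1234h and IP = 0010h are arbitrary assumptions chosen for illustration:

```latex
\begin{aligned}
\text{physical address} &= \mathrm{CS} \times 16 + \mathrm{IP} \\
                        &= 1234\mathrm{h} \times 10\mathrm{h} + 0010\mathrm{h} \\
                        &= 12340\mathrm{h} + 0010\mathrm{h} = 12350\mathrm{h}
\end{aligned}
```

Multiplying by 16 (10h) appends one hexadecimal zero to the segment value, which is the same as the 4-bit left shift.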
- Operator seg, which obtains the segment component of
the address of an expression. It returns the segment
address for an expression (the expression
may be a label, a variable, a group name, or any other
symbolic name). The syntax of this operator:
seg Expression
- Operator offset, which obtains the offset of an expression.
It returns the offset of the expression in bytes, measured
from the beginning of the segment in which the expression
is defined.
Example of using these operators:
data segment
smth dw 8
……
code segment
…….
mov AX,seg smth
mov DS,AX
mov DX, offset smth ; now the pair DS:DX holds
; the full address (segment and offset) of smth
Problems.
1. Which types of statements may an assembly program
include?
2. Describe the general structure of an EXE-format program.
3. Why is it necessary to include int 21h in an assembly program?