(Chapter 5)
Tami Meredith, Winter 2015
We write .asm files containing ASCII (i.e., text) versions of our program
MASM assembles our .asm file into a .obj file – relocatable, unlinked, Intel32 binary code
All the .obj files are linked to create a relocatable executable – a .exe file
The .exe file is loaded into main memory, addresses are resolved, and the program is executed
Procedure: Same thing as a "method" in Java or a "function" in C – just a bunch of code to do a specific task
Link Library : A bunch of .obj files merged together
A file containing procedures that have been compiled into machine code constructed from one or more OBJ files
OBJ files are assembled from ASM source files
Library built using the Microsoft LIB utility (or similar tool)
Irvine32.lib is an example of a link library
Library is linked (i.e., joined) to the .obj file assembled from your .asm file when you build your project
• Call/Use a (library) procedure using the CALL instruction
• Some procedures require input arguments, which must be pre-placed in the proper location => a register
• The INCLUDE directive copies in the procedure prototypes (same thing as #include <stdio.h> )
INCLUDE Irvine32.inc
.code
mov eax, 1234h      ; input argument
call WriteHex       ; show hex number
call Crlf           ; end of line
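The fragment above omits the surrounding program. A minimal complete version might look like the following sketch, assuming Irvine32.inc and Irvine32.lib are installed where the assembler and linker can find them (the `exit` macro and the processor/model setup come from the include file):

```asm
; Minimal complete program (sketch) built around the WriteHex fragment.
; Assumes the Irvine32 include file and library are on the build path.
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax, 1234h     ; input argument for WriteHex
    call WriteHex       ; show EAX as a hexadecimal number
    call Crlf           ; end of line
    exit                ; Irvine32 macro that ends the program
main ENDP
END main                ; names main as the program entry point
```

The `END main` line tells the assembler where execution starts; everything else is the same three instructions shown above.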
Your programs link to Irvine32.lib using the linker command inside a batch file named make32.bat.
Notice the two LIB files: Irvine32.lib and kernel32.lib (part of the Microsoft Win32 SDK)
Your program links to Irvine32.lib, which links to kernel32.lib, which executes kernel32.dll
CloseFile – Closes an open disk file
Clrscr - Clears console, locates cursor at upper left corner
CreateOutputFile - Creates new disk file for writing in output mode
Crlf - Writes end of line sequence to standard output
Delay - Pauses program execution for n millisecond interval
DumpMem - Writes block of memory to standard output in hex
DumpRegs – Displays general-purpose registers and flags (hex)
GetCommandTail - Copies command-line args into array of bytes
GetDateTime – Gets the current date and time from the system
GetMaxXY - Gets number of cols, rows in console window buffer
GetMseconds - Returns milliseconds elapsed since midnight
GetTextColor - Returns active foreground and background text colors in the console window
Gotoxy - Locates cursor at row and column on the console
IsDigit - Sets Zero flag if AL contains ASCII code for decimal digit (0–9)
MsgBox, MsgBoxAsk – Display popup message boxes
OpenInputFile – Opens existing file for input
ParseDecimal32 – Converts unsigned integer string to binary
ParseInteger32 - Converts signed integer string to binary
Random32 - Generates 32-bit pseudorandom integer in the range 0 to FFFFFFFFh
Randomize - Seeds the random number generator
RandomRange - Generates a pseudorandom integer within a specified range
ReadChar - Reads a single character from standard input
ReadDec - Reads 32-bit unsigned decimal integer from keyboard
ReadFromFile – Reads input disk file into buffer
ReadHex - Reads 32-bit hexadecimal integer from keyboard
ReadInt - Reads 32-bit signed decimal integer from keyboard
ReadKey – Reads character from keyboard input buffer
ReadString - Reads string from standard input, terminated by <Enter>
SetTextColor - Sets foreground and background colors of all subsequent console text output
Str_compare – Compares two strings
Str_copy – Copies a source string to a destination string
Str_length – Returns length of a string
Str_trim - Removes unwanted characters from a string
Str_ucase - Converts a string to uppercase letters
WaitMsg - Displays message, waits for Enter key to be pressed
WriteBin - Writes unsigned 32-bit integer in ASCII binary format.
WriteBinB – Writes binary integer in byte, word, or dword format
WriteChar - Writes a single character to standard output
WriteDec - Writes unsigned 32-bit integer in decimal format
WriteHex - Writes an unsigned 32-bit integer in hexadecimal format
WriteHexB – Writes byte, word, or dword in hexadecimal format
WriteInt - Writes signed 32-bit integer in decimal format
WriteStackFrame - Writes the current procedure’s stack frame to the console
WriteStackFrameName - Writes the current procedure’s name and stack frame to the console
WriteString - Writes null-terminated string to console window
WriteToFile - Writes buffer to output file
WriteWindowsMsg - Displays most recent error message generated by MS-Windows
DON'T memorise these. Just know what can be done and be able to look them up for argument/parameter details (pgs 134-149, mostly, in alphabetical order)
Clear the screen, delay the program for 500 milliseconds, and dump the registers and flags
.code
call Clrscr         ; clear the screen
mov eax,500         ; delay time in milliseconds
call Delay          ; pause for 500 ms
call DumpRegs       ; display registers and flags
Sample output:
EAX=00000613 EBX=00000000 ECX=000000FF EDX=00000000
ESI=00000000 EDI=00000100 EBP=0000091E ESP=000000F6
EIP=00401026 EFL=00000286 CF=0 SF=1 ZF=0 OF=0
Display a null-terminated string and move the cursor to the beginning of the next screen line.
.data
str1 BYTE "Bus Strikes Really Suck!",0
.code
mov edx,OFFSET str1
call WriteString
call Crlf
Display a null-terminated string and move the cursor to the beginning of the next screen line (use embedded CR/LF)
.data
str1 BYTE "The lab was too long!",0Dh,0Ah,0
.code
mov edx, OFFSET str1
call WriteString
Display an unsigned integer in binary, decimal, and hexadecimal, each on a separate line
testVal = 35
.code
mov eax, testVal
call WriteBin       ; display binary
call Crlf
call WriteDec       ; display decimal
call Crlf
call WriteHex       ; display hexadecimal
call Crlf
Sample output :
0000 0000 0000 0000 0000 0000 0010 0011
35
00000023
• Input a string from the user
• EDX points to the string and ECX specifies the maximum number of characters the user is permitted to enter
Note: null (zero) byte is automatically added by ReadString
.data
fileName BYTE 80 DUP(0)
.code
mov edx, OFFSET fileName
mov ecx, SIZEOF fileName - 1
call ReadString
• Generate and display ten pseudo-random (semi-random) signed integers in the range 0 – 99
• Pass each integer to WriteInt (via EAX) and display it on a separate line
.code
mov ecx,10          ; loop counter
genNum:
mov eax,100         ; ceiling value
call RandomRange    ; generate random int
call WriteInt       ; display signed int
call Crlf           ; goto next display line
loop genNum         ; repeat loop
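The library list also includes Randomize, which seeds the generator. Without a seed, the loop above can be expected to print the same ten numbers on every run. A sketch of the same loop with seeding added, written as a complete program and assuming the Irvine32 library is available:

```asm
; Sketch: the ten-random-integers loop, seeded so that each run differs.
; Assumes Irvine32.inc / Irvine32.lib are on the build path.
INCLUDE Irvine32.inc

.code
main PROC
    call Randomize      ; seed the random number generator once
    mov  ecx, 10        ; loop counter
genNum:
    mov  eax, 100       ; ceiling value: results fall in 0..99
    call RandomRange    ; EAX = pseudorandom integer below the ceiling
    call WriteInt       ; display signed int
    call Crlf           ; go to next display line
    loop genNum         ; repeat loop
    exit
main ENDP
END main
```

Randomize is called once, before the loop; calling it inside the loop would reseed on every iteration.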
Display a null-terminated string with yellow characters on a blue background.
.data
str1 BYTE "Yanks can't spell colour!",0
.code
; Low 4 bits = foreground
; Next 4 bits = background
; 00000000 00000000 00000000 bbbbffff
mov eax, yellow + (blue * 16)
call SetTextColor
mov edx, OFFSET str1
call WriteString
call Crlf
Imagine a stack of plates:
plates are only added to the top = "pushed" on the stack
plates are only removed from the top = "popped" from the stack
LIFO structure – "Last In, First Out"
10 (top)
9
8
7
6
5
4
3
2
1 (bottom)
Managed by the CPU, using two registers
SS (stack segment) – Segment being used for stack
ESP (stack pointer) – Pointer/Address/Offset of TOP of Stack
In reality, the stack is actually “upside down”
Offset      Contents
00001000    00000006    <- ESP
00000FFC
00000FF8
00000FF4
00000FF0
PUSH
1. A 32-bit push operation decrements the stack pointer by 4, and
2. Copies a value into the location pointed to by the stack pointer
BEFORE
Offset      Contents
00001000    00000006    <- ESP
00000FFC
00000FF8
00000FF4
00000FF0
AFTER (pushing 000000A5)
Offset      Contents
00001000    00000006
00000FFC    000000A5    <- ESP
00000FF8
00000FF4
00000FF0
After pushing two more integers:
Offset      Contents
00001000    00000006
00000FFC    000000A5
00000FF8    00000001
00000FF4    00000002    <- ESP
00000FF0
The stack grows downward (into LOWER addresses/offsets)
The area below ESP is always available (unless the stack overflows)
Overflow: When segment is filled (and no more space is available)
POP
1. Copies value at stack[ESP] into a register or variable, and
2. Adds n to ESP, where n is either 2 or 4 (depending on size of destination)
BEFORE
Offset      Contents
00001000    00000006
00000FFC    000000A5
00000FF8    00000001
00000FF4    00000002    <- ESP
00000FF0
AFTER
Offset      Contents
00001000    00000006
00000FFC    000000A5
00000FF8    00000001    <- ESP
00000FF4
00000FF0
PUSH and POP
PUSH syntax:
1. PUSH r/m16
2. PUSH r/m32
3. PUSH imm32
POP syntax:
1. POP r/m16
2. POP r/m32
r/m = register/memory
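The LIFO behaviour can be seen directly by pushing two immediates and popping them into a register. This is only a sketch, assuming the Irvine32 library for output; the values 1 and 2 are arbitrary:

```asm
; Sketch: values come off the stack in the reverse of the order pushed.
; Assumes Irvine32.inc / Irvine32.lib are on the build path.
INCLUDE Irvine32.inc

.code
main PROC
    push 1              ; PUSH imm32: ESP decreases by 4
    push 2              ; ESP decreases by another 4
    pop  eax            ; POP r32: EAX = 2 (last in, first out)
    call WriteDec       ; displays 2
    call Crlf
    pop  eax            ; EAX = 1; ESP is back where it started
    call WriteDec       ; displays 1
    call Crlf
    exit
main ENDP
END main
```

Every PUSH is matched by a POP, so ESP has its original value again when the program exits.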
PUSH and POP
• Save and restore registers when they contain important values
• Sort of like anonymous, automatic, memory locations
• PUSH and POP instructions occur in the opposite order (LIFO, or FILO)
• Very important ADT in computing
push esi                    ; push registers
push ecx
push ebx
mov esi,OFFSET dwordVal     ; display some memory
mov ecx,LENGTHOF dwordVal
mov ebx,TYPE dwordVal
call DumpMem
pop ebx                     ; restore registers
pop ecx
pop esi
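Because values come off in the opposite order, the stack can also be used as temporary storage to reverse a sequence. The following sketch reverses a string in place; the names `phrase` and `phraseSize` are made up for illustration, and the Irvine32 library is assumed for output:

```asm
; Sketch: reverse a string by pushing every character, then popping them.
; phrase and phraseSize are hypothetical names used only in this example.
INCLUDE Irvine32.inc

.data
phrase     BYTE "Stacks are LIFO",0
phraseSize = ($ - phrase) - 1       ; length, not counting the null byte

.code
main PROC
    mov  ecx, phraseSize            ; loop counter
    mov  esi, 0                     ; index into the string
L1: movzx eax, phrase[esi]          ; get a character (zero-extended)
    push eax                        ; push it on the stack
    inc  esi
    loop L1

    mov  ecx, phraseSize
    mov  esi, 0
L2: pop  eax                        ; characters come back in reverse order
    mov  phrase[esi], al            ; store in the string
    inc  esi
    loop L2

    mov  edx, OFFSET phrase
    call WriteString                ; display the reversed string
    call Crlf
    exit
main ENDP
END main
```

Each character is widened to 32 bits before the push so that every stack entry stays doubleword-sized.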
Idea:
Use stack to save ecx (loop counter) of outer loop when in inner loop
>> push the outer loop counter before entering the inner loop
mov ecx, 100        ; set outer loop count
outer:              ; begin the outer loop
push ecx            ; save outer loop count
mov ecx, 20         ; set inner loop count
inner:              ; begin the inner loop
… Code for inner loop goes here …
loop inner          ; repeat the inner loop
pop ecx             ; restore outer loop count
loop outer          ; repeat the outer loop
1. PUSHFD and POPFD push and pop the EFLAGS register
2. PUSHAD pushes the 32-bit general-purpose registers on the stack in this order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
3. POPAD pops the same registers off the stack in reverse order
4. PUSHA and POPA do the same for 16-bit registers
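One way to use PUSHAD and POPAD is to bracket a block of code that overwrites several registers. In this sketch the procedure name `ShowBanner` and the string `banner` are hypothetical, and the Irvine32 library is assumed:

```asm
; Sketch: PUSHAD/POPAD preserve all general-purpose registers
; around code that overwrites some of them.
; ShowBanner and banner are hypothetical names for this example.
INCLUDE Irvine32.inc

.data
banner BYTE "EAX and EBX should match in both dumps",0

.code
main PROC
    mov  eax, 11111111h
    mov  ebx, 22222222h
    call DumpRegs           ; registers before the call
    call ShowBanner
    call DumpRegs           ; EAX and EBX still hold the same values
    exit
main ENDP

ShowBanner PROC
    pushad                  ; save EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
    mov  eax, 0             ; overwrite some registers on purpose
    mov  ebx, 0
    mov  edx, OFFSET banner
    call WriteString
    call Crlf
    popad                   ; restore them in reverse order
    ret
ShowBanner ENDP
END main
```

This pattern is not suitable for a procedure that returns a value in EAX, because POPAD would overwrite the result with the saved value.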
Large problems can be divided into smaller tasks to make them more manageable
A procedure is the ASM equivalent of a Java Method, C/C++ Function, Basic Subroutine, or Pascal Procedure
Same thing as what is in the Irvine32 library
The following is an assembly language procedure named sample:
sample PROC
… Code for procedure goes here …
ret
sample ENDP
SumOf
Example of BAD documentation (from the book)
;-----------------------------------------------------
SumOf PROC
;
; Calculates and returns the sum of three 32-bit ints
; Receives: EAX, EBX, ECX, the three integers
; may be signed or unsigned.
; Returns: EAX = sum
; status flags are changed.
; Requires: nothing
;-----------------------------------------------------
add eax,ebx
add eax,ecx
ret
SumOf ENDP
SumOf
; Calculates the (signed or unsigned)
; integer sum: EAX = EAX + EBX + ECX
; Returns: EAX = sum, changes eflags
SumOf PROC
add eax, ebx
add eax, ecx
ret
SumOf ENDP
Comments are only useful if they add value! A short intro is useful when a procedure is long and complex. Lots of fancy formatting just takes longer to do and makes it less likely that you will write comments at all.
CALL and RET
The CALL instruction calls a procedure
1. pushes offset of next instruction on the stack (saves the value of the instruction pointer)
2. copies the address of the called procedure into EIP (puts the address of the procedure into the instruction pointer)
3. begins to execute the code of the procedure
The RET instruction returns from a procedure
1. pops top of stack into EIP (over-writes instruction pointer with the value of the instruction after the call)
CALL and RET
00000025 is the offset of the instruction immediately following the CALL instruction
main PROC
00000020 call MySub
00000025 mov eax,ebx
.
.
main ENDP
00000040 is the offset of the first instruction inside
MySub
MySub PROC
00000040 mov eax,edx
.
.
ret
MySub ENDP
CALL and RET
The CALL instruction pushes 00000025 onto the stack, and loads 00000040 into EIP
Stack: 00000025 <- ESP        EIP: 00000040
Conceptually (EIP cannot actually be named as an operand): CALL = PUSH eip, then MOV EIP, OFFSET proc
The RET instruction pops 00000025 from the stack into EIP
Stack: 00000025 <- ESP (stack shown before RET executes)        EIP: 00000025
Conceptually: RET = POP eip
main PROC
.
.
call Sub1
exit
main ENDP
Sub1 PROC
.
.
call Sub2
ret
Sub1 ENDP
Sub2 PROC
.
.
call Sub3
ret
Sub2 ENDP
Sub3 PROC
.
.
ret
Sub3 ENDP
By the time Sub3 is called, the stack contains all three return addresses:
(ret to main)
(ret to Sub1)
(ret to Sub2)    <- ESP
1. A local label is visible only to statements inside the same procedure
2. A global label is visible everywhere
main PROC
jmp L2          ; error
call sub2
L1::            ; global label
exit
main ENDP
sub2 PROC
L2:             ; local label
jmp L1          ; legal, but stupid
ret             ; When is ret ever called?
sub2 ENDP
The ArraySum procedure calculates the sum of an array. It makes two references to specific variable names :
ArraySum PROC
mov esi,0                   ; array index
mov eax,0                   ; set the sum to zero
mov ecx,LENGTHOF myArray    ; set number of elements
forEach:
add eax,myArray[esi]        ; add each integer to sum
add esi,4                   ; point to next integer
loop forEach                ; repeat for array size
mov theSum,eax              ; store the sum
ret
ArraySum ENDP
This procedure needs parameters so that the array name and result location can be passed in/out and permit the function to be used with different arrays.
This version of ArraySum returns the sum of any doubleword array whose address is in ESI. The sum is returned in EAX:
; Add an array of doublewords
; ESI = address of array, ECX = no. of elements
; Returns: EAX = sum; ECX, ESI, & flags changed
ArraySum PROC
mov eax,0           ; set the sum to zero
forEach:
add eax,[esi]       ; add each integer to sum
add esi,4           ; point to next integer
loop forEach        ; repeat for array size
ret
ArraySum ENDP
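A caller loads ESI and ECX before the call and picks up the result from EAX afterwards. The sketch below shows one such caller; the array contents and the variable `theSum` are sample values chosen for illustration, the loop label is shortened to `L1`, and the Irvine32 library is assumed for output:

```asm
; Sketch: calling the parameterised ArraySum with a sample array.
; array and theSum are illustrative; any doubleword array would do.
INCLUDE Irvine32.inc

.data
array  DWORD 10000h, 20000h, 30000h, 40000h, 50000h
theSum DWORD ?

.code
main PROC
    mov  esi, OFFSET array      ; ESI = address of array
    mov  ecx, LENGTHOF array    ; ECX = number of elements
    call ArraySum               ; EAX = sum (000F0000h for this data)
    mov  theSum, eax            ; the caller decides where the sum goes
    call WriteHex               ; display the sum in hexadecimal
    call Crlf
    exit
main ENDP

; Add an array of doublewords
; ESI = address of array, ECX = no. of elements
; Returns: EAX = sum; ECX, ESI, & flags changed
ArraySum PROC
    mov  eax, 0                 ; set the sum to zero
L1: add  eax, [esi]             ; add each integer to sum
    add  esi, 4                 ; point to next integer
    loop L1                     ; repeat for array size
    ret
ArraySum ENDP
END main
```

Because the procedure no longer names `myArray` or `theSum`, the same code can sum any doubleword array by changing only the two `mov` instructions in the caller.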
"Control" flow diagrams
An old, but still somewhat effective, technique
Basis of UML 2.0 Activity Diagrams (basically same thing)
The following symbols are the basic building blocks of flowcharts: begin / end, manual input, display, process (task), procedure call, decision (yes / no)
ArraySum Procedure
Flowchart: begin -> push esi, ecx -> eax = 0 -> AS1: add eax,[esi] -> add esi, 4 -> ecx = ecx - 1 -> ecx > 0? (yes: back to AS1; no: pop ecx, esi -> end)
Code:
push esi
push ecx
mov eax,0
AS1:
add eax,[esi]
add esi,4
loop AS1
pop ecx
pop esi
USES lists the registers that are modified by a procedure
MASM inserts code that will try to preserve them
ArraySum PROC USES esi ecx
mov eax,0           ; set the sum to zero
etc.
MASM generates the push and pop instructions shown here:
ArraySum PROC
push esi
push ecx
.
.
pop ecx
pop esi
ret
ArraySum ENDP
*** Don't PUSH/POP the register used for the return value!
The sum of the three registers is stored in EAX on line (3).
The POP instruction replaces it with the starting value of EAX on line (4):
SumOf PROC          ; sum of three integers
push eax            ; 1
add eax,ebx         ; 2
add eax,ecx         ; 3
pop eax             ; 4 Oh Noes! Replaced the sum!
ret
SumOf ENDP
We will not be using the USES directive. I want you to manually store/save any registers that need preserving during a procedure call.
Top-Down Design (functional decomposition) involves the following:
1. design your program before starting to code
2. break large tasks into smaller ones
3. use a hierarchical structure based on procedure calls
4. test individual procedures separately
The Problems …
1. Assumes programmer has a strong understanding of the necessary and correct architecture
2. Important decisions made early – mistakes thus costly
3. Assumes hierarchical structure is actually possible
4. All initial work on design, none on coding, nothing to show for large periods of time
5. Highly likely to fail or be ineffective on large projects
Bottom-Up Design (functional synthesis) involves the following:
1. pick one small part of the program, write a procedure to perform it
2. repeat until a collection of small parts is formed
3. write a procedure to join several small parts
4. continue until all the parts are implemented and joined
• Build some Lego blocks, connect them, build some more …
Some comments …
1. Design tends to emerge, not necessarily planned
2. Gets parts working sooner, easier to spot programming road-blocks
3. Hard to know how big the parts should be
4. Can ignore some parts if time runs out (and they are not important)
5. Not good for teams as no plan exists
6. Generally safer, but less overall structure
Reality: Most projects mix top-down design with bottom-up implementation and opportunistic approaches; much personal preference exists
Description: Write a program that prompts the user for multiple 32-bit integers, stores them in an array, calculates the sum of the array, and displays the sum on the screen.
Main steps (Decomposition):
1. Prompt user for multiple integers
2. Calculate the sum of the array
3. Display the sum
Main
    Clrscr                  ; clear screen
    PromptForIntegers
        WriteString         ; display string
        ReadInt             ; input integer
    ArraySum                ; sum the integers
    DisplaySum
        WriteString         ; display string
        WriteInt            ; display integer
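(architecture diagram)
Summation Program (main) calls Clrscr, PromptForIntegers, ArraySum, and DisplaySum
PromptForIntegers calls WriteString and ReadInt
DisplaySum calls WriteString and WriteInt
Note: Clrscr, WriteString, ReadInt, and WriteInt are library procedures (shaded gray in the original diagram)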
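The structure above could be turned into code along the following lines. This is a sketch, not a reference solution: the constant `INTEGER_COUNT`, the prompt strings, and the choice of which registers each procedure saves are assumptions made for illustration. Registers are saved manually with PUSH/POP rather than with USES.

```asm
; Sketch of the Summation program. INTEGER_COUNT, str1, str2 and the
; register-saving choices are assumptions, not taken from the slides.
INCLUDE Irvine32.inc

INTEGER_COUNT = 3

.data
str1  BYTE "Enter a signed integer: ",0
str2  BYTE "The sum of the integers is: ",0
array DWORD INTEGER_COUNT DUP(?)

.code
main PROC
    call Clrscr                 ; clear screen
    mov  esi, OFFSET array      ; ESI = address of array
    mov  ecx, INTEGER_COUNT     ; ECX = number of elements
    call PromptForIntegers      ; fill the array
    call ArraySum               ; EAX = sum
    call DisplaySum             ; show the sum
    exit
main ENDP

; Reads ECX integers into the array at ESI. Preserves all registers.
PromptForIntegers PROC
    push eax
    push ecx
    push edx
    push esi
    mov  edx, OFFSET str1
L1: call WriteString            ; display string
    call ReadInt                ; input integer into EAX
    call Crlf
    mov  [esi], eax             ; store in the array
    add  esi, TYPE DWORD        ; point to next element
    loop L1
    pop  esi
    pop  edx
    pop  ecx
    pop  eax
    ret
PromptForIntegers ENDP

; ESI = address of array, ECX = no. of elements. Returns: EAX = sum
ArraySum PROC
    push esi
    push ecx
    mov  eax, 0                 ; set the sum to zero
L1: add  eax, [esi]             ; add each integer to sum
    add  esi, TYPE DWORD        ; point to next integer
    loop L1
    pop  ecx
    pop  esi
    ret
ArraySum ENDP

; Displays the sum held in EAX. Preserves EDX.
DisplaySum PROC
    push edx
    mov  edx, OFFSET str2
    call WriteString            ; display string
    call WriteInt               ; display integer in EAX
    call Crlf
    pop  edx
    ret
DisplaySum ENDP
END main
```

Each procedure can be tested on its own, which fits either design approach: ArraySum can be written and checked bottom-up before the prompting and display code exists. The label `L1` appears in two procedures without conflict because it is a local label in each.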