CS241 Notes on Chapter 2 of Sargent and Shoemaker

Our assembler, the GNU portable assembler hosted on Intel 486, uses a different syntax than Intel intended. This writeup documents what needs to be changed in the book to reflect these differences. Luckily not everything is different. The register names are Intel-standard, for example. We'll call Gnu as "gas", although it's on the system as "i386-as", in a Gnu directory.

We will be programming using only "protected mode", not "real mode". The text discusses both real and protected modes, to cover the older environments many programmers still need to know about. Protected mode is the mode in use by Windows 95, Windows NT, and UNIX on i486, i.e. all the proper 32-bit operating system environments. This document tells you what you can skip because it's real-mode-only.

p. 15-27: This is all basic material, all relevant. We tend to write hex numbers using the C syntax used by gas, "0x20" instead of 20h, but it's easy enough to read. The caption on Fig. 2-2 has only little-endian examples, in spite of its title "Little and big endian...".

p. 28: Here's the first snippet of assembler. It is written in gas as follows:

        gas                     Intel syntax
        movl $8, %eax           mov al, 8
        addl $3, %eax           add al, 3
        movl %eax, 0x200        mov [200], al

Note how the operands are reversed: Intel syntax moves data from right to left, while gas makes it go left to right. The machine instructions are the same; this is just the text syntax interpreted by the assembler.

Also note how Intel syntax uses "mov" without specifying whether it's moving a byte, word, or doubleword, leaving that determination up to the assembler from various other clues such as the destination operand size. Gas expects this information with mov: movb, movw, or movl. The Gnu assembler does accept "mov", and does select between l and b based on its own obscure rules, but won't select w without explicit "w" specification. As you will see, gas is uncompromisingly 32-bit-oriented.
Since we are too, this should be no problem.

When you see "mov al, 8", it means load 8 into the A register, or "accumulator" (strictly, into AL, its low byte). The real-mode world works with the 8-bit AL and 16-bit AX forms of this register. The natural translation to the 32-bit world is to load 8 into the full 32-bit A register, denoted EAX for extended AX, where AX is the old 16-bit A register.

The final difference this code shows is in the reference to address 0x200 in memory. This difference disappears as soon as the location in memory is named, say "answer". Then in both cases we would simply use this name: "movl %eax, answer" or "mov answer, ax".

Note that we want to use 32-bit operands by default. Whenever reasonable, we will expand out to the full 32-bit register size. Bytes are still useful, just as chars are in C. So we tend to use bytes and doublewords for all ordinary code, and words only when needed to talk to the hardware. (Those of you who know Intel assembler already may point out that "mov ax, 8" can be made to execute as 32-bit code by manipulating the program's computing environment, but let's not confuse the newcomers!)

p. 30 The 32-bit code takes more bytes to encode it:

        b8 08 00 00 00      (with 8 as a little-endian 32-bit number)
        83 c0 03            (surprisingly, the 3 can be encoded in a byte)
        a3 00 02 00 00      (with 200 as little-endian 32-bit number)

Thus the opcode for the first mov is now b8 instead of b0, and for the second mov is a3 instead of a2. The add has changed from 04 to 83 c0 -- some instructions take multiple bytes before the embedded addresses or data.

p. 34, Section 2-6 16-bit Register Set. Read this but go right on to p. 51, x86 32-bit Register Set. The latter is what we'll be using.

p. 38, Section 2-7 The Segment Registers. Just skip this. It describes the real-mode use of segment registers, which is different from the protected-mode use. Rest assured that you won't have to manipulate segment registers for ordinary programming.
For our environment, like Windows 95/NT/UNIX, the segment registers are loaded once and for all by the boot-up code and only changed after that for very specialized actions. The code and data segments have the same setup, so code addresses and data addresses are in the same "space", just like MC68000 addresses.

p. 41, Section 2-8. Some Simple Assembly-Language Examples. Our debugger is called Tutor. Because the code and data (and stack) segments are set up identically, segment prefixes are dropped from the addresses, so the addresses are simply numbers from 0 to 0x3fffff, for the 4MB of physical memory actually present on the machine. Thus once downloaded to 0x200000, the little sample program from above would look like this (the first mov is 5 bytes long, the add 3 bytes, and the second mov 5 bytes):

        Tutor> md 200000
        200000: b8 08 00 00 00 83 c0 03 a3 00 02 00 00 xx xx xx

Tutor can't disassemble this, but it can execute it one instruction at a time:

        Tutor> .eip 200000      (set eip to 200000)
        Tutor> t                (trace 1 instruction)
        eip = 200005
        Tutor> t
        eip = 200008
        Tutor> rd               (reg display)
        EAX= 0000000b ...
        Tutor> t
        eip = 20000d
        Tutor> md 200           (look at addr 200)
        00000200: 0b 00 00 00 xx xx ...

You can disassemble an object file with i386-objdump, for which the alias "disas" is defined.

p. 44, the second example to sum 10 numbers. We can write it to be callable from C. In the sample below, the .text directive means start of code section. The .globl directive makes _sum10 an external symbol, accessible to C.
The assembly code is thus rewritten:

        # sum10.s Sum of first 10 numbers
                .text
                .globl _sum10
        _sum10: movl $1, %ecx       # 1 is the first integer to be added
                movl $0, %eax       # Initialize the sum to zero
        addint: addl %ecx,%eax      # Add an int to the sum
                incl %ecx           # inc the count by 1
                cmpl $10,%ecx       # compare to decimal 10
                jbe addint          # jump back if not above 10
                ret                 # return to C caller

and the C program which calls _sum10 looks like:

        #include <stdio.h>

        /* tell C this is an extern function in another file */
        extern int sum10(void);

        int main(void)
        {
            printf("sum of 10 ints is %d\n", sum10());
            return 0;
        }

Note that the C name sum10 appears as _sum10 in the assembler source. Here we are using the fact that the value returned by a C function (in this case the function sum10) is normally left in %eax, so the call to "sum10()" evaluates to the sum calculated in the assembly code. Also, it's OK for the function to clobber %eax, %ecx and %edx, as these are the C scratch registers, ones that C allows each function to clobber at will. It is up to the caller (main in this case) to save and restore these registers if it wants their contents to be maintained across the call.

p. 45. Here is the second version of sum10.s:

        # sum10.s Sum of first 10 numbers
                .globl _sum10
                .text
        _sum10: movl $10, %ecx      # 10 is the first integer to be added
                movl $0, %eax       # Initialize the sum to zero
        addint: addl %ecx,%eax      # Add an int to the sum
                loop addint         # dec ecx, loop if ecx not 0
                ret

To do the hello example, we need to output single chars, one at a time. This can be done by putchar(ch) and luckily it is pretty easy to call Gnu C from assembler. We can't use "int 21h" because we don't have DOS in memory. DOS requires real mode execution, so it is incompatible with our environment. We could use the "out" instruction directly to some port, but that is unnecessarily low-level.

Calling putchar from assembler. To call putchar(ch), we need to push the argument on the stack.
Gnu C uses 32-bit storage for chars that are on the stack during calls (just as it does on the 68000), so we push a doubleword with the char code in its low byte. On return, we need to adjust the stack to remove the arg pushed on it.

                .text
                # note this sequence clobbers %eax, %ecx, %edx!!
                pushl %edx          # assuming ch in %edx
                call _putchar       # call putchar
                addl $4,%esp        # adjust stack back

However, as the comment says, this little sequence also clobbers the C "scratch registers", %eax, %ecx, and %edx. To provide a safe calling sequence, we need to save and restore these registers around the call:

                pushl %eax          # save C scratch regs
                pushl %ecx
                pushl %edx
                pushl %edx          # assuming ch in %edx
                call _putchar       # call putchar
                addl $4,%esp        # adjust stack back
                popl %edx           # restore C scratch regs
                popl %ecx
                popl %eax

We see that %edx gets pushed twice here, so this could be optimized to:

                pushl %eax          # save C scratch regs
                pushl %ecx
                pushl %edx          # assuming ch in %edx
                call _putchar       # call putchar
                popl %edx           # restore C scratch regs
                popl %ecx
                popl %eax

The complete code for this example now looks like:

        # hello.s--print hello as 5 individual chars in a loop
                .globl _hello
                .text
        _hello: movl $msg, %eax     # point eax at string to output
                movl $5,%ecx        # ecx has # chars to output
                movl $0,%edx        # clear out edx
        dochar: movb (%eax),%dl     # get a char from memory
                # note putchar clobbers eax, ecx and edx--
                pushl %eax          # save %eax
                pushl %ecx          # save %ecx
                pushl %edx          # save %edx and provide putchar arg
                call _putchar
                popl %edx           # restore %edx
                popl %ecx           # restore %ecx
                popl %eax           # restore %eax
                # end of putchar sequence
                incl %eax           # point eax at next char
                loop dochar         # loop back until 5 chars done
                ret
                .data
        msg:    .asciz "hello"

Notice the .asciz directive with label msg: at the end of the code. This forms a string of characters (five in this case) terminated in a zero byte.
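For comparison, here is a minimal C sketch of what hello.s does, assuming the same five-character message. The names msg_size and hello_c are made up for this illustration and are not part of the course files. It shows that the .asciz string occupies six bytes (five chars plus the zero byte) and that the loop hands one char at a time to putchar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same data as "msg: .asciz "hello"": five chars plus a zero byte. */
static const char msg[] = "hello";

/* Hypothetical helper: bytes reserved for msg, terminator included. */
size_t msg_size(void)
{
    return sizeof msg;
}

/* Hypothetical C counterpart of _hello: output the first `count`
 * chars of msg one at a time and return how many were written. */
int hello_c(int count)
{
    const char *p = msg;            /* roughly: movl $msg, %eax */
    int written = 0;

    assert(count >= 0 && (size_t)count < sizeof msg);
    while (count-- > 0) {           /* roughly: loop dochar */
        /* The char is widened to int when passed, much as the
         * assembly pushes a doubleword holding it in the low byte. */
        if (putchar(*p++) == EOF)
            break;
        written++;
    }
    return written;
}
```

Calling hello_c(5) prints the five chars of "hello" and returns 5. Unlike the assembly version, the C compiler takes care of saving any scratch registers it needs around the putchar call.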
If we just want to print a string out, we can use printf:

                <save scratch regs in use>
                pushl $msg
                call _printf
                addl $4, %esp
                <restore scratch regs in use>

For fancier printf's, just push the args in reverse order:

                <save scratch regs in use>
                pushl %eax          # to print this in hex
                pushl $format       # ptr to format string
                call _printf
                addl $8, %esp
                <restore scratch regs in use>
                ...
                .data
        format: .asciz "val is %x"

p. 47 Single-char input program: Here we call the C lib's getchar.

        # inchar.s Input chars until Q typed
                .globl _start
                .text
        _start: call _getchar       # get 1 char, echo it
                cmpb $'Q', %al      # is it a Q?
                jnz _start          # no, loop back
                ret

p. 49 Here is the closest translation of this example:

        # binhex1.s Convert binary 0x3d to two hex digits, print them
                .global _binhex
                .text
        _binhex: movl $0, %eax      # Start with cleared eax
                movb $0x3d, %al     # Put value to display in al
                movb %al, %dh       # Save copy of val in dh
                shrb $4, %al        # Shift high nibble into low
                call dodigit        # Conv. to ASCII and print
                movb %dh, %al       # Move number into al again
                andb $0xf, %al      # Zero out high nibble
                call dodigit        # Conv. to ASCII and print
                ret                 # return to C call of binhex

        # do one digit: convert to ASCII code, output
        dodigit: addb $0x30, %al    # Add '0' to conv to ASCII
                cmpb $0x39,%al      # is digit between 0 and 9?
                jbe digdone         # Conv. complete if so
                addb $7,%al         # Add 7 more if digit is A-F
        digdone: pushl %edx         # Save %edx (%ecx not in use here)
                pushl %eax          # arg to putchar: char to print
                call _putchar       # call putchar
                addl $4, %esp       # adj. stack for arg
                popl %edx           # restore %edx
                ret                 # return to caller

We can easily generalize this by passing the number to be printed as an arg to binhex. Since we know args are pushed on the stack, we just go to the stack for the char. It will be just under the return address, thus at 4 bytes past the current stack pointer value, at address 4(%esp) -- just like 68000 except that this is little-endian.
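The argument handling and the digit arithmetic can be checked with a small C sketch. The names hex_digit and binhex_c are invented for this illustration, and ASCII character codes are assumed. The sketch takes the low byte of the 32-bit value that would sit on the stack, then applies the same steps as dodigit: add 0x30, and add 7 more if the result is past '9'.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a nibble (0..15) to its ASCII hex digit the way dodigit does. */
char hex_digit(unsigned nibble)
{
    unsigned ch;

    assert(nibble <= 0xf);
    ch = nibble + 0x30;             /* addb $0x30, %al */
    if (ch > 0x39)                  /* cmpb $0x39, %al / jbe digdone */
        ch += 7;                    /* addb $7, %al: move up to 'A'..'F' */
    return (char)ch;
}

/* Hypothetical C counterpart of binhex: stack_arg stands for the
 * doubleword pushed by the caller; only its low byte is used.
 * Writes two hex digits and a terminating zero byte into out. */
void binhex_c(uint32_t stack_arg, char out[3])
{
    unsigned char value = (unsigned char)(stack_arg & 0xff);

    out[0] = hex_digit(value >> 4);     /* shrb $4, %al */
    out[1] = hex_digit(value & 0xf);    /* andb $0xf, %al */
    out[2] = '\0';
}
```

For example, binhex_c(0x3d, buf) leaves the string "3D" in buf. The sketch stores the digits instead of calling putchar so that the result is easy to inspect.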
Remember, we are looking at the low byte of the doubleword on the stack, which occupies locations 4(%esp) through 7(%esp). Because the machine is little-endian, that low byte is the one at 4(%esp).

        # binhex.s--print out byte provided by caller, in hex
                .global _binhex
                .text
        _binhex: movb 4(%esp), %al  # Put byte to display in al
                movb %al, %dh       # Save copy of val in dh
                ... same as above

This would work fine. A further refinement is used as well in binhex2.s in $pcex.

C program:

        extern void binhex(unsigned char c);

        int main(void)
        {
            binhex(0x3d);
            return 0;
        }

Or, more generally, scanf in a number from the user and pass it to binhex.

Glossary

clobber: destroy a value by overwriting it with another value. Usually used for register contents in assembly language, but also occasionally used in C programming:

        update_function(&x); /* probably "clobbers" int x */

C scratch register: a register that may be clobbered by a C-compatible function. C uses the scratch regs for intermediate values in computations, and does not restore them back to their original values before the function returns. The C scratch registers for i386-gcc are eax, ecx, and edx. (For the UNIX Sun3 (68020 CPU) gcc, they are d0, d1, a0, and a1.)

C stable register: a general register that is not a C scratch reg. If a C-compatible function changes a stable register, it must restore it to its original value before returning.

General register: one of the CPU registers that can be accessed by ordinary assembler programs for loading and storing values. See S&S, p. 52 for our case.