-----------------------------INLINE DOCUMENTATION-----------------------------
           Interfacing Scheme to assembly code, by Larry Bartholdi
------------------------------------------------------------------------------

Typically, Scheme users (like all high-level language users) are allowed
little if any knowledge of the whats and hows of their system. The reason is
somewhat ideological (opposing 'hackers' to 'theoreticians'), but it is also
a safety concern. Working at a lower level allows greater freedom but meaner
bugs, and the software engineer should always consider the high-level
approach first.

Nevertheless, PC Scheme/Geneva allows full access to lower levels through a
special list structure that can be executed. The lists that can be executed
are of two kinds: PCS-CODE-BLOCKs (PCB) and PCS-INLINE-BLOCKs (PIB).

A PCB is defined thus:

        (list 'PCS-CODE-BLOCK #symbols #codebytes
              (list <symbol> ...) (list <codebyte> ...))

It is meant to be executed by the PCS virtual machine (VM). The symbols are
accessible from the code through the mov_constant instruction, and the code
bytes are of course the instructions to be executed. If this sounds too
abstract, try ``(compile '(+ a 1))'', and you will see:

        (PCS-CODE-BLOCK 1 10 (A) (2 4 1 6 8 0 80 4 8 59))

This is a code block containing 1 symbol (A) and 10 code bytes, reading

        2 4 1           : load-immediate 1
        6 8 0           : lookup-value of A
        80 4 8          : add up the values
        59              : return

The complete instruction set can be found in sources.asm\assembly.ash.
Generally speaking, these instructions are executed by a 64-register tagged
virtual machine. Scheme code is translated into a highly optimised VM code
block, so little can be gained by hand-coding inner loops. If you don't trust
me, have a look at recur.s. It contains a good Scheme routine and an optimal
VM-code routine. Their performance matches to within about 10%, while the
development times differed by a factor of 10!
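To make the PCB layout described above concrete, here is a minimal sketch in
plain Scheme. The helper name code-block-consistent? is hypothetical (it is
not part of PCS); it only checks that the two declared counts match the
lengths of the symbol list and the code-byte list, and says nothing about
whether the bytes are valid VM instructions.

```scheme
; Hypothetical helper, not part of PCS: check that the symbol count and
; code-byte count declared in a PCS-CODE-BLOCK match its two lists.
(define (code-block-consistent? block)
  (and (pair? block)
       (= (length block) 5)
       (eq? (car block) 'PCS-CODE-BLOCK)
       (= (list-ref block 1) (length (list-ref block 3)))    ; #symbols
       (= (list-ref block 2) (length (list-ref block 4)))))  ; #codebytes

; The block produced by (compile '(+ a 1)) in the text:
(code-block-consistent?
  '(PCS-CODE-BLOCK 1 10 (A) (2 4 1 6 8 0 80 4 8 59)))
```

The last expression evaluates to true for the block quoted in the text: one
symbol is declared and one is listed, ten code bytes are declared and ten are
listed. A check of this kind could be useful before handing a hand-written
block to the VM.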

Note also that VM instructions might differ from one version of PCS to
another, so you might have to rewrite your code if you use PCBs directly.

Of greater interest are PIBs. They are defined as:

        (list 'PCS-INLINE-BLOCK #bytes (list <byte> ...))

The byte list corresponds to 8086 instructions. When a PIB is executed,
control is passed to the first byte of the list. The assembly subtask must
then return with a far return (#hCB). All registers may be trashed except SS
and SP (and of course CS and IP!). A trivial example is then (try it!):

        (%EXECUTE '(PCS-INLINE-BLOCK 1 (#hCB)))    ---> undefined value

What is now needed is parameter passing. This is provided by a special
construct, (inline-lambda (args ...) P[IC]B). Inline-lambda takes an argument
list (or a number of arguments) and executes the block it was passed with the
arguments spread over VM registers R1-R61. The assembly or VM code can then
access the registers, modify them, etc., and return a value in register R1.
Again a simple example:

        ((inline-lambda (n) '(pcs-inline-block 1 (#hCB))) 1234)

would return 1234, because the value passed in R1 (1234) is not changed.

To code less trivial programs, some definitions, macros and aliases are
needed. They can be found in sources.asm\inline.ash. inline.ash defines
equates for VM registers R0-R63 (R0 is nil, and R63 is reserved), structures
to access these registers, and helper functions.

A VM register is defined as a base and a displacement into the VM's paginated
memory. The bases are 8-bit values used as indexes into a segment table. To
examine the object pointed to by a register, one would typically use:

        mov     bx, [reg1.page]         ; get the register's page
        call    ldpage C, bx            ; get the corresponding segment (in ax)
        mov     es, ax
        mov     bx, [reg1.disp]         ; get the displacement part of the
                                        ; reg. in bx

and the data can be found at [es:bx].

The objects can be of various types: fake objects like short integers (signed
16-bit), which are just their displacement value, or true objects composed of
a tag (1 byte), a length (2 bytes) and some data. The length includes the tag
and the length bytes. Many objects contain pointers, which are just another
representation for a register, except that the register is 32-bit with disp
first, while the pointer is 24-bit with page first. The tag-length-data motto
is violated by two objects: lists and floating-point numbers, whose size is
fixed. A pair is just two pointers, while a flonum is a tag followed by the
64-bit data. My...

You can recognize or allocate a new Scheme object by specifying its type
(tag):

        LISTTYPE = 0
        FIXTYPE  = 2
        FLOTYPE  = 4
        BIGTYPE  = 6
        SYMBTYPE = 8
        STRTYPE  = 10
        VECTTYPE = 12
        CONTTYPE = 14
        CLOSTYPE = 16
        FREETYPE = 18
        CODETYPE = 20
        I86TYPE  = 22
        PORTTYPE = 24
        CHARTYPE = 26
        ENVTYPE  = 28

Any other value would cause unpredictable results. See INLINE.ASH for more
details on the internal structure of VM objects.

Reg0-reg63 are aliases; an assembly code block is given control with DS:SI
pointing to register R0, so SI must be preserved as long as the routine wants
to access registers. DI is also used and points to a dispatch table allowing
dynamic linking to the most important PCS routines, here documented by
example:

        call    ldpage C, PAGE
                    returns a segment descriptor in ax
        call    alloc_big_block C, REGPTR, TYPE, SIZE
                    allocates a big memory block (> 1/128 of total memory)
                    REGPTR is a register's address that will point to the
                    new object.
        call    alloc_block C, REGPTR, TYPE, SIZE
                    allocates a variable-length object.
        call    alloc_flonum C, REGPTR, FLOVALUE
                    allocates and stores a 64-bit FP value
        call    alloc_int C, REGPTR, BIGNUM_buffer
                    ditto - where a BIGNUM buffer is a size (1 word) in
                    words, a sign (1 byte) 0 or 1, and the data, LSB first.
        call    alloc_list_cell C, REGPTR
                    finds storage for a pair
        call    alloc_string C, STRING
                    ditto - the string is null-terminated
        call    cons C, REGPTR1, REGPTR2, REGPTR3
                    allocates a pair, sets its car to REGPTR2 and its cdr to
                    REGPTR3, and makes REGPTR1 point to the pair.
        call    free C, memory block
                    the standard C function
        call    getch C
                    guess what
        call    get_max_cols C
                    returns the screen's width
        call    get_max_rows C
                    returns the screen's depth
        call    int2long C, REGPTR
                    reads a number from the register REGPTR and returns it
                    in DX:AX (32-bit signed). No checks are done.
        call    is_graph_mode C
                    returns 1 if graphics mode, 0 if alphanumerical
        call    long2int C, REGPTR, 32VALUE
                    sets register REGPTR to the 32-bit signed value 32VALUE
        call    malloc C, SIZE
                    the standard C function
        call    nosound C
                    turns off the speaker
        call    sound C, FREQUENCY
                    turns on the speaker
        call    zcuroff C
                    turns off the cursor
        call    zcuron C
                    turns on the cursor
        call    zprintf C, FORMAT, ...
                    like C's printf
        call    zputc C, CHAR
                    write a character to the screen
        call    zscroll C, LINE, COL, NLINES, NCOLS, ATTRIBUTE
                    scrolls the screen area up 1 line
        call    zscroll_d C, (ditto)
                    scrolls the screen area down 1 line

Among the tricks to know is that an assembly block may be placed anywhere in
memory, so all local variables must be accessed pc-relative, with the code:

                call    $+3
                pop     bp
                sub     bp, $-1
                mov     ax, [bp+mydata]
                ...
        mydata  dw      ?

An example of this is given in inline\snow.asm.

Besides providing useful equates, inline.ash defines two macros,

        startinline name, argcount
and
        endinline

These construction blocks delimit an assembly subroutine.

The trivial example would be coded:

;---------------------------------------------- start of TRIVIAL.ASM
        IDEAL
        INCLUDE "inline.ash"

StartInline TRIVIAL, -1         ; -1 means any argument count (mu-lambda)
        ret
EndInline
        END
;---------------------------------------------- end of TRIVIAL.ASM

and assembled with

        TASM trivial
        TLINK /t trivial, trivial.bin

now type

        PCS
        (load "trivial.bin")
        (trivial 1 2 3)         ----> (1 2 3)
        (exit)

You'll find some other examples in the inline\ subdirectory. Note how tasks
are shared between Scheme and assembly code in PEEK.ASM and PEEK.S:
typically, type checking, the link to the debugger and the user interface
should ALWAYS be written in pure Scheme. You should only use assembly code to
do things you wouldn't be able to do in Scheme (in our example, direct memory
and port access). The compiled versions, PEEK.BIN and PEEK.FSL, are located
in the PCS system directory, but since they make an attempt against PCS
decency, they are not loaded automatically; use (load "peek.fsl") if needed.

Please send any comments to schemege@uni2a.unige.ch, and, most of all, good
luck!

                                                        20 nov 1992 (lb)

(*) Borland products (TASM, ...) are trademarks of Borland International Inc.