SIC Assembly Macro Processor


Version 1.1

The purpose of this document is to explain the SIC assembly macro processor, its abilities, and how to use it.

Compiling the SIC macro processor

The macro processor comes as a package of 28 files. Three files, y.tab.c, y.tab.h, and lex.yy.c, are generated at compile time. The macro processor makes extensive use of a bottom-up parser generated by YACC (or a similar program such as Bison) and a lexical analyzer generated by LEX or Flex. Therefore, a parser generator (YACC or Bison) and a lexical analyzer generator (LEX or Flex) must both be installed on the system where the macro processor will be compiled.

Most Linux distributions come with Bison and Flex. However, you may encounter systems that have YACC and AT&T LEX. Unfortunately, Bison and YACC, and likewise Flex and LEX, differ enough that minor changes must be made before compiling, depending on which tools you have installed.

If you ever move the program from one computer to another and it compiles and runs but seems to hang in an infinite loop, then most likely one of these changes needs to be made.

One of the header files comprising the macro processor is "version.h". That short header file defines several symbolic constants. Uncomment the line naming the parser generator you have, and the line naming the lexical analyzer generator you have.

There is one more change you may need to make. In the makefile there is a line near the top that looks like:

LIBRARY=lfl

If you have AT&T LEX, change this line to

LIBRARY=ll

Otherwise, leave it as LIBRARY=lfl.

Then issue a "make" command to build the macro processor.

Executing the SIC Macro Processor

This section covers the basic use of the macro processor and its command line options.

The most common way to execute the macro processor is:

mp filename

where "filename" is the file being pre-processed. The resulting code is written to a file with the same name plus an added .asm extension.

The macro processor accepts any number of files on the command line, so a command like

mp filename1 filename2 filename3 . . . filenameN

is perfectly valid. Each file is pre-processed independently of the others and written to filename1.asm, filename2.asm, and so on. Therefore, any macros defined in "filename1", for example, will not be available to "filename2".

The macro processor can, however, make use of a library file. A library file is a file of SIC macros written one after the other, in no particular order. The library file is read, and the macros in it become available to any file specified on the command line AFTER the library file. Library files are designated by a "-l" on the command line before the library file name. Therefore,

mp -l library

will load the library called "library", but no output will be generated because no source files were specified after it. The command

mp -l lib filename

will load the library called "lib" and process the file "filename". Any macro called in "filename" that is defined in "lib" will therefore be found and expanded into the final source.

There may be times when macros defined in a library conflict with macros defined locally within the source file.

For example, "lib" in the previous command could contain a macro called PUTC, and the source file "filename" could also contain a macro by the same name. If this occurs, the local macro (the one in the source) takes precedence, and it is that definition that will be expanded.

There may also be times when two library files specified on the command line contain a macro with the same name. If this happens, the ambiguity is resolved by using the macro in the last library file specified.

So if there is a macro called PUTC in library "lib" and a macro called PUTC in a library called "lib2", and the command line is

mp -l lib -l lib2 source

then the PUTC in "lib2" will be used for any PUTC expansion.
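The precedence rules can be summarized in a few lines of Python. This is only an illustrative model, not the macro processor's actual code: the function name resolve_macro and the use of dictionaries to stand in for macro tables are assumptions made for the sketch.

```python
def resolve_macro(name, local_macros, libraries):
    """Return the definition that would win for `name`, or None.

    local_macros: dict of macros defined in the source file itself.
    libraries: list of dicts, in the order their -l options appeared.
    (Both structures are hypothetical stand-ins for the real tables.)
    """
    # A definition in the source file always takes precedence.
    if name in local_macros:
        return local_macros[name]
    # Otherwise the last library named on the command line wins.
    for library in reversed(libraries):
        if name in library:
            return library[name]
    return None


lib = {"PUTC": "PUTC from lib"}
lib2 = {"PUTC": "PUTC from lib2"}

# Models: mp -l lib -l lib2 source   (source defines no PUTC)
print(resolve_macro("PUTC", {}, [lib, lib2]))
# Same command line, but the source defines its own PUTC
print(resolve_macro("PUTC", {"PUTC": "local PUTC"}, [lib, lib2]))
```

The first call selects the definition from "lib2", and the second selects the local definition, matching the rules described above.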

To summarize how to use the macro processor, here are some example command lines and their meanings:

mp source1

Process the file "source1" and output the results to "source1.asm".

mp -l lib source1

Process the file "source1", using the library file "lib", and output the results to "source1.asm".

mp source1 source2 -l lib source3

Process the source files "source1" and "source2" independently of each other without referencing any libraries, and output the results to "source1.asm" and "source2.asm" respectively. Then load a library file called "lib" and process the source file "source3", writing those results to "source3.asm".

mp -l source1 source2

Use "source1" as a library file, and process "source2", writing the results to "source2.asm". What happens if you use a source file as a library? The file will be scanned for macro definitions. Any other code is ignored.
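The left-to-right handling of the command line can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not the program's real argument parser: the function name plan_runs is hypothetical, and only the "-l" option is modeled.

```python
def plan_runs(args):
    """Model how mp walks its arguments from left to right.

    Returns a list of (source, output, libraries_loaded_so_far) tuples.
    Hypothetical helper: only the -l option is handled here.
    """
    runs = []
    libraries = []
    i = 0
    while i < len(args):
        if args[i] == "-l":
            # The argument after -l names a library, not a source file.
            if i + 1 < len(args):
                libraries.append(args[i + 1])
            i += 2
        else:
            # Each source sees only the libraries loaded before it,
            # and its output name is the source name plus ".asm".
            runs.append((args[i], args[i] + ".asm", tuple(libraries)))
            i += 1
    return runs


# Models: mp source1 source2 -l lib source3
for run in plan_runs(["source1", "source2", "-l", "lib", "source3"]):
    print(run)
```

In this model, "source1" and "source2" are listed with no libraries, while "source3" is listed with "lib", and a command line consisting only of "-l library" yields no runs at all.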

Macro Processor Abilities

The macro processor encompasses all of these features:

1. Set Variables

2. System Qualifiers

3. Concatenation of Symbols and Limited Expressions

4. Recursive Defines (define macros within macros)

5. Recursive Expansions (expanding macros within macros)

6. Positional and Named Parameters

7. Conditional Assembly with Forward Jump Points

8. Infinite Loop and Infinite Recursion Protection

See the macro processor documentation that is provided with the class for the meaning of these features.

There are some subtleties to forming expressions that the macro processor can understand. One reason for these subtleties is that expressions can include string constants, variables, numbers, and system functions, and these expressions can potentially be attached to any column of the SIC code. That means that labels, opcodes, operands, and even comments are processed for expressions and can potentially be dynamic. Ambiguities can arise, though, and measures have to be taken to make sure things are interpreted correctly.

For example, take:

(&MSG,X), HI

If the value of &MSG is HELLO, then the expression above will be translated into:

HELLO,X, HI

Where did the parentheses go? Sometimes parentheses are meant to be kept in the code, and sometimes they are not. In this case, they were most likely meant to be kept. Something like

(&MSG~"BYE BYE"), HI

was most likely meant to have no parentheses in the output. But there is no way to know for sure; maybe the programmer wanted them there for some reason.

The macro processor treats both cases the same way, as small expressions the parser can handle (see the grammar below), and in both cases the parentheses are stripped off. To keep parentheses in the output, add another pair where needed:

( (&MSG,X) ), HI

now produces:

(HELLO,X), HI

One pair of parentheses is consumed as part of the expression, while the other pair is copied into the final code.

It's worth documenting the grammar used for conditional assembly. This gives some idea of what can be done and what will produce parser errors. The grammar is:

expression -> exp
exp -> exp relop sub_exp
exp -> exp logicalop sub_exp
exp -> sub_exp
exp -> NOT sub_exp
sub_exp -> sub_exp addop term
sub_exp -> term
term -> term mulop factor
term -> factor
factor -> (exp)
factor -> var
routine -> %NITEMS()
routine -> %NITEMS(var)
routine -> %LENGTH(var)
routine -> %SUBSTR(var, pvar, pvar)
var -> id | string_const | string | number | routine
var -> st_index | csl_index
pvar -> id | number
id -> &ID
string_const -> "STRING"
string_const -> STRING
number -> signed | usigned
usigned -> NUM
signed -> +NUM | -NUM
relop -> LT | GT | GTE | LTE | EQ | NE
logicalop -> AND | OR
mulop -> * | / | MOD
addop -> + | - | ~ | COMMA
param_list -> param_list param
param_list -> param
param -> var
st_index -> id[param_list]
csl_index -> id{var}

As the grammar shows, an expression such as 5+5 is legal and will parse. But an expression like 5++5 will produce a parsing error, and the whole expression will default to 0.

System functions such as %SUBSTR() work as expected in a case like %SUBSTR("HELLO", 0, 2), which produces the result "HE". However, the expression %SUBSTR("HELLO", 0) is a parse error. From the grammar you can see that %SUBSTR() requires one "var" parameter, which can be a string constant, a number, a normal string, or another routine such as %LENGTH(), and two "pvar" parameters, which should be IDs or numbers.

As a basic rule, if an expression is grammatically correct (that is, it produces NO parse errors) but during evaluation, while the internal parse tree is being traversed, the macro processor cannot find a value for an ID or a parameter, the result defaults to 0 or an empty string. If you are confused about why the macro processor gives a certain result, it may help to remember that it defaults to 0 when it cannot make sense of an expression. For example, the code below is grammatically valid:

&VAR     SET     "HELLO"
&VAR2    SET     %SUBSTR(&VAR, 5, 1)
         IF      &VAR2 EQ "O"
         … do something in here …
         ENDIF
&VAR3    SET     "HELLO"+7

There are two examples here. In the first one, the programmer may have been trying to extract the last letter of HELLO and put it in the set variable &VAR2, but forgot that indices start at 0. The expression is grammatically correct, but the macro processor cannot evaluate it, so the result is the default: 0, or in this case, the empty string. The second example doesn't make sense because you can't add a string and a number, but according to the grammar it's legal. So the macro processor will treat "HELLO" as 0, and the result will be that of 0+7.
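The defaulting behavior in these two examples can be imitated with a small Python model. This is a sketch under assumptions rather than the macro processor's own evaluator: the helper names substr, as_number, and add are hypothetical, the third argument of %SUBSTR is assumed to be a length, and any request that reaches past the end of the string is assumed to give the empty-string default.

```python
def substr(text, start, length):
    """Model of %SUBSTR(var, pvar, pvar) with a 0-based start index.

    Assumption: the third argument is a length, and a request that
    reaches outside the string yields the empty-string default.
    """
    if start < 0 or length < 0 or start + length > len(text):
        return ""
    return text[start:start + length]


def as_number(value):
    """Treat anything that is not an integer as the default value 0."""
    return value if isinstance(value, int) else 0


def add(left, right):
    """Model of the + operator under the default-to-0 rule."""
    return as_number(left) + as_number(right)


print(substr("HELLO", 0, 2))        # the working case from the text
print(repr(substr("HELLO", 5, 1)))  # index 5 is past the end
print(substr("HELLO", 4, 1))        # the last letter, using a 0-based index
print(add("HELLO", 7))              # the string operand counts as 0
```

Under these assumptions the off-by-one request returns an empty string, so a test such as &VAR2 EQ "O" would quietly fail, while starting at index 4 gives the intended letter.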

Infinite Loop Protection

The macro processor will try to protect the programmer from horrible mistakes such as these:

         MACRO
&LABEL   MYMACRO  &P1
         IF       &P1 EQ "HI"
         MYMACRO  &P1
         ENDIF
         MEND
. expansion call
         MYMACRO  "HI"

Or something like:

         WHILE    1
         LDA      B
         ENDW

Both examples produce an infinite loop in the conditional assembly. The only way to keep code like this from hanging the macro processor indefinitely (and from filling your disk space with huge files) is to impose loop limits. By default the limits are 250 iterations for WHILE and 100 recursive calls. After these limits are reached, the macro processor steps in and breaks the cycle.

There may be times, though, when a loop larger than 250 iterations is needed. In this case, there are two options. One is to change the constant in the code and recompile; the other is to use the "-d" option on the command line, which tells the macro processor to shut off loop protection.

BE WARNED though! If you shut it off and there is an infinite loop in your code, the macro processor WILL enter into an infinite loop. So, be careful!

So, by entering:

mp -l lib -d sourcefile

you are telling the macro processor to load the "lib" library in the current directory and process the file "sourcefile" without any looping safeties.
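The effect of the limits can be illustrated with a brief Python model. It is only a sketch of the idea, not the macro processor's implementation: the names WHILE_LIMIT, RECURSION_LIMIT, expand_while, and expand_recursive are hypothetical, and the protect flag stands in for the absence of the "-d" option.

```python
WHILE_LIMIT = 250       # default loop limit quoted in the text
RECURSION_LIMIT = 100   # default recursive-call limit quoted in the text


def expand_while(condition, body, protect=True):
    """Run body() while condition() holds; return the iteration count.

    With protect=True the loop is cut off after WHILE_LIMIT passes.
    """
    count = 0
    while condition():
        if protect and count >= WHILE_LIMIT:
            break  # the safety steps in and breaks the cycle
        body()
        count += 1
    return count


def expand_recursive(should_recurse, depth=0, protect=True):
    """Model a macro that calls itself; return the depth reached."""
    if protect and depth >= RECURSION_LIMIT:
        return depth  # stop expanding once the limit is hit
    if should_recurse(depth):
        return expand_recursive(should_recurse, depth + 1, protect)
    return depth


# Like WHILE 1 ... ENDW: the condition never becomes false.
print(expand_while(lambda: True, lambda: None))
# Like MYMACRO calling itself with an argument that never changes.
print(expand_recursive(lambda depth: True))
```

Both calls terminate only because protection is on. Passing protect=False to either one with these arguments would model the "-d" behavior and never return normally, so that is deliberately not shown.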
