Assembly Language Part 3

Symbols
• Symbols are assembler names for memory addresses
• Can be used to label data or instructions
• Syntax rules:
  – start with letter
  – contain letters & digits
  – 8 characters max
  – CASE sensitive
• Define by placing symbol label at start of line, followed by colon

Symbol Table
• Assembler stores labels & corresponding addresses in lookup table called symbol table
• Value of symbol corresponds to 1st byte of memory address (of data or instruction)
• Symbol table only stores label & address, not nature of what is stored
• Instruction can still be interpreted as data, & vice versa

Example
Example program:
    this:   deco this, d
            stop
            .end
Output: 14592
What happened?
• this labels the deco instruction itself (address 0x0000), so deco prints the instruction's own first two bytes, 0x3900, as a decimal integer: 57 × 256 = 14592

High Level Languages & Compilers
• Compilers translate high level language code into low level language; may be:
  – machine language
  – assembly language
  – for the latter an additional translation step is required to make the program executable

C++/Java example
    // C++ code:
    #include <iostream>
    #include <string>
    using namespace std;

    string greeting = "Hello world";

    int main () {
        cout << greeting << endl;
        return 0;
    }

    // Java code:
    public class Hello {
        static String greeting = "Hello world";
        public static void main (String [] args) {
            System.out.print (greeting);
            System.out.print('\n');
        }
    }

Assembly language (approximate) equivalent
              br main
    greeting: .ASCII "Hello world \x00"
    main:     stro greeting, d
              charo '\n', i
              stop
              .end

Data types
• In a high level language, such as Java or C++, variables have the following characteristics:
  – Name
  – Value
  – Data Type
• At a lower level (assembly or machine language), a variable is just a memory location
• The compiler generates a symbol table to keep track of high level language variables

Symbol table entries
The illustration above shows a snippet of output from the Pep/8 assembler.
Each symbol table entry includes:
• the symbol
• the value (the symbol’s start address)
• the type (.ASCII in this case)

Pep/8 Branching instructions
• We have already seen the use of BR, the unconditional branch instruction
• Pep/8 also includes 8 conditional branch instructions; these are used to create assembly language control structures
• These instructions are described on the next couple of slides

Conditional branching instructions
• BRLE:
  – branch on less than or equal
  – how it works: if N or Z is 1, PC = operand
• BRLT:
  – branch on less than
  – how it works: if N is 1, PC = operand
• BREQ:
  – branch on equal
  – how it works: if Z is 1, PC = operand
• BRNE:
  – branch on not equal
  – how it works: if Z is 0, PC = operand

Conditional branching instructions
• BRGE:
  – branch on greater than or equal
  – if N is 0, PC = operand
• BRGT:
  – branch on greater than
  – if N and Z are both 0, PC = operand
• BRV:
  – branch if overflow
  – if V is 1, PC = operand
• BRC:
  – branch if carry
  – if C is 1, PC = operand

Example
HLL code:
    int num;
    Scanner kb = new Scanner(System.in);
    System.out.print ("Enter a number: ");
    num = kb.nextInt();
    if (num < 0)
        num = -num;
    System.out.print(num);

Pep/8 code:
            br main
    num:    .block 2
    prompt: .ascii "Enter a number: \x00"
    main:   stro prompt, d
            deci num, d
            lda num, d
            brge endif
            lda num, d
            nega            ; negate value in a
            sta num, d
    endif:  deco num, d
            stop
            .end

Analysis of example
• The Pep/8 code branches on the opposite condition (brge skips the negation), so the if statement, if translated back to Java, would now be more like:
      if (num >= 0); else num = -num;
• The translation of num = -num (lda, nega, sta) requires a little more explanation; see next slide

Analysis continued
• A compiler must be programmed to translate assignment statements; a reasonable translation of x = 3 might be:
  – load a value into the accumulator
  – evaluate the expression
  – store result to variable
• In the case
above (and in the assembly language code on the previous page), evaluation of the expression isn’t necessary, since the initial value loaded into A is the only value involved (the second load is really the evaluation of the expression)

Compiler types and efficiency
• An optimizing compiler would perform the necessary source code analysis to recognize that the second load is extraneous
  – advantage: end product (executable code) is shorter & faster
  – disadvantage: takes longer to compile
• So, an optimizing compiler is good for producing the end product, or a product that will be executed many times (for testing); a nonoptimizing compiler, because it does the translation quickly, is better for mid-development

Another example
HLL code:
    final int limit = 100;
    int num;
    System.out.print("Enter a #: ");
    if (num >= limit)
        System.out.print("high");
    else
        System.out.print("low");

Pep/8 code:
            br main
    limit:  .equate 100
    num:    .block 2
    high:   .ascii "high\x00"
    low:    .ascii "low\x00"
    prompt: .ascii "Enter a #: \x00"
    main:   stro prompt, d
            deci num, d
    if:     lda num, d
            cpa limit, i
            brlt else
            stro high, d
            br endif
    else:   stro low, d
    endif:  stop
            .end

Compare instruction: cpr where r is a register (a or x):
• action same as subr except difference (result) isn’t stored in the register; it just sets status bits
• if N or Z is 1, <= is true

Writing loops in assembly language
• As we have seen, an if or if/else structure in assembly language involves a comparison and then a (possible) branch forward to another section of code
• A loop structure is actually more like its high level language equivalent; for a while loop, the algorithm is:
  – perform comparison; branch forward if condition isn’t met (loop ends)
  – otherwise, perform statements in loop body
  – perform unconditional branch back to comparison

Example
• The following example shows a C++ program (because this is easier to demonstrate in C++ than in Java) that performs the following algorithm:
  – prompt for input (of a string)
  – read one
character
  – while (character != end character ('*'))
    • write out character
    • read next character

C++ code
    #include <iostream>
    using namespace std;

    int main () {
        char ch;
        cout << "Enter a line of text, ending with *" << endl;
        cin.get (ch);
        while (ch != '*') {
            cout << ch;
            cin.get(ch);
        }
        return 0;
    }

Pep/8 code
    ;am3ex5
            br main
    ch:     .block 1
    prompt: .ascii "Enter a line of text ending with *\n\x00"
    main:   stro prompt, d
            chari ch, d     ; initial read
    while:  lda 0x0000, i   ; clear accumulator
            ldbytea ch, d   ; load ch into A
            cpa '*', i
            breq endW
            charo ch, d
            chari ch, d     ; read next letter
            br while
    endW:   stop
            .end

Do/while loop
• Post-test loop: condition test occurs after iteration
• Premise of sample program:
  – cop is sitting at a speed trap
  – speeder drives by
  – within 2 seconds, cop starts following, going 5 meters/second faster
  – how far does the cop travel before catching up with the speeder?

C++ version of speedtrap
    #include <iostream>
    #include <cstdlib>
    using namespace std;

    int main() {
        int copDistance = 0;    // cop is sitting still
        int speeder;            // speeder's speed: entered by user
        int speederDistance;    // distance speeder travels from cop's position
        cout << "How fast is the driver going? (Enter whole #): ";
        cin >> speeder;
        speederDistance = speeder;
        do {
            copDistance += speeder + 5;
            speederDistance += speeder;
        } while (copDistance < speederDistance);
        cout << "Cop catches up to speeder in " << copDistance << " meters." << endl;
        return 0;
    }

Pep/8 version
    ;speedTrap
            br main
    cop:    .block 2
    drvspd: .block 2
    drvpos: .block 2
    prompt: .ascii "How fast is the driver going? (Enter whole #): \x00"
    outpt1: .ascii "Cop catches up to speeder in \x00"
    outpt2: .ascii " meters\n\x00"
    main:   lda 0, i
            sta cop, d
            stro prompt, d
            deci drvspd, d  ; cin >> speeder;
            ldx drvspd, d   ; speederDistance = speeder;
            stx drvpos, d

Pep/8 version continued
    do:     lda 5, i        ; copDistance += speeder + 5;
            adda drvspd, d
            adda cop, d
            sta cop, d
            addx drvspd, d  ; speederDistance += speeder;
            stx drvpos, d
    while:  lda cop, d      ; while (copDistance <
            cpa drvpos, d   ;        speederDistance);
            brlt do
            stro outpt1, d
            deco cop, d
            stro outpt2, d
            stop
            .end

For loops
• For loop is just a count-controlled while loop
• Next example illustrates nested for loops

C++ version
    #include <iostream>
    #include <cstdlib>
    using namespace std;

    int main() {
        int x, y;
        for (x = 0; x < 4; x++) {
            for (y = x; y > 0; y--)
                cout << "* ";
            cout << endl;
        }
        return 0;
    }

Pep/8 version
    ;nestfor
            br main
    x:      .word 0x0000
    y:      .word 0x0000
    main:   sta x, d        ; x = A
            stx y, d        ; y = X
    outer:  adda 1, i       ; next row: A = A + 1
            cpa 5, i
            breq endo       ; done when A reaches 5 (rows of 1 to 4 stars)
            sta x, d        ; x = row number
            ldx x, d        ; X counts the stars left in this row
    inner:  charo '*', i
            charo ' ', i
            subx 1, i
            cpx 0, i
            brne inner      ; repeat until X is 0
            charo '\n', i
            br outer
    endo:   stop
            .end

Notes on control structures
• It’s possible to create “control structures” in assembly language that don’t exist at a higher level
• Your text describes such a structure, illustrated and explained on the next slide

A control structure not found in nature
• Condition C1 is tested; if true, branch to middle of loop (S3)
• After S3 (however you happen to get there: via branch from C1 or sequentially, from S2) test C2
• If C2 is true, branch to top of loop
• No way to do this in C++ or Java (at least, not without the dreaded goto statement)

High level language programs vs.
assembly language programs
• If you’re talking about pure speed, a program in assembly language will almost always beat one that originated in a high level language
• Assembly and machine language programs produced by a compiler are almost always longer and slower
• So why use high level languages (besides the fact that assembly language is a pain in the patoot)?

Why high level languages?
• Type checking:
  – data types sort of exist at low level, but the assembler doesn’t check your syntax to ensure you’re using them correctly
  – can attempt to DECO a string, for example
• Encourages structured programming

Structured programming
• Flow of control in program is limited to nestings of if/else, switch, while, do/while and for statements
• Overuse of branching instructions leads to spaghetti code

Unstructured branching
• Advantage: can lead to faster, smaller programs
• Disadvantage: Difficult to understand
  – and debug
  – and maintain
  – and modify
• Structured flow of control is a newer idea than branching; a form of branching with gotos by another name lives on in the Java/C++ switch/case structure

Evolution of structured programming
• First widespread high level language was FORTRAN; it introduced a new conditional branch statement:
      if (expression) GOTO new location
• Considered improvement over assembly language: combined CPr and BR statements
• Still used opposite logic:
              if (expression not true) branch else
              // if-related statements here
              branch past else
      else:   // destination for if branch
              // else-related statements here

Block-structured languages
• ALGOL-60 (introduced in 1960 – hey, me too) featured first use of program blocks for selection/iteration structures
• Descendants of ALGOL include C, C++, and Java

Structured Programming Theorem
• Any algorithm containing GOTOs can be written using only nested ifs and while loops (proven back in 1966)
• In 1968, Edsger Dijkstra wrote a famous letter to the editor of Communications of the ACM entitled “Go To Statement Considered Harmful”
  – considered the structured programming manifesto
• It turns out that structured code is less expensive to develop, debug and maintain than unstructured code, even factoring in the cost of additional memory requirements and execution time
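
As a rough illustration of the Structured Programming Theorem, the sketch below takes the "control structure not found in nature" (C1 jumps into the middle of a loop at S3, C2 branches back to the top at S2) and rewrites it using only a while loop, an if, and one extra flag. The names Trace, withGoto, structured, s2, s3 and c2 are hypothetical stand-ins invented for this example; they are not from the text, and this is one possible rewrite under those assumptions rather than the canonical one.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the slide's statements S2, S3 and condition C2.
struct Trace {
    std::string log;   // records the order in which statements ran
    int repeats;       // how many more times C2 will evaluate to true
    void s2() { log += "S2 "; }
    void s3() { log += "S3 "; }
    bool c2() { return repeats-- > 0; }
};

// Unstructured version: follows the branches described on the slide.
std::string withGoto(bool c1, int repeats) {
    Trace t{"", repeats};
    if (c1) goto middle;      // C1 true: jump into the middle of the loop
top:
    t.s2();
middle:
    t.s3();
    if (t.c2()) goto top;     // C2 true: branch back to the top of the loop
    return t.log;
}

// Structured version: only while and if, at the cost of two flag variables.
std::string structured(bool c1, int repeats) {
    Trace t{"", repeats};
    bool skipS2 = c1;         // remembers that the first pass enters at S3
    bool again = true;
    while (again) {
        if (!skipS2) t.s2();
        skipS2 = false;       // only the first pass may skip S2
        t.s3();
        again = t.c2();
    }
    return t.log;
}
```

Calling withGoto(true, 2) and structured(true, 2) should produce the same trace of statements. The structured version needs the extra flags and one extra test per pass, which is the kind of memory and execution-time cost the last bullet above refers to.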