Assembly Language – 1

1. Machine Language
This is what the computer actually sees and deals with. Every command the computer sees is given as a number or a sequence of numbers.

2. Assembly Language
This is the same as machine language, except that the command numbers have been replaced by letter sequences (mnemonics), which are easier to memorize. Other small conveniences, such as symbolic names for locations, make it easier as well.

3. High-Level Language
High-level languages exist to make programming easier. Assembly language requires you to work with the machine itself, whereas a high-level language lets you describe the program in a more natural language. A single command in a high-level language is usually equivalent to several commands in assembly language.

To study (self-study)
• Syntax
• Variables
• Basic data movement and arithmetic instructions
• Program organization: code, data and stack
• An assembly language program MUST be converted into a machine language program by an assembler before it can be executed.

1. Syntax
• Assembly language is a low-level programming language.
• An assembly language is specific to a certain computer architecture, in contrast to most high-level programming languages, which are generally portable across multiple systems.

Assembler
• Assembly language programs are converted into executable machine code by a utility program called an assembler. The conversion process is called assembly, or assembling the program.
• Assembler: Microsoft Macro Assembler [MASM]

Assembly language
• A program written in assembly language consists of a series of (mnemonic) processor instructions, meta-statements (known variously as directives, pseudo-instructions and pseudo-ops), comments and data.
• Assembly language instructions usually consist of an opcode mnemonic followed by a list of data, arguments or parameters. An assembler translates them into machine language instructions that can be loaded into memory and executed.

Statement format:
name      operation   operand(s)   comment
START:    MOV         CX, 5        ; write whatever you like here, but include the ';'
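The four-field statement format above (name, operation, operands, comment) can be illustrated with a short Python sketch. This is only a rough illustration under simplifying assumptions, not how MASM itself parses source: the helper name `split_statement` is hypothetical, and the sketch assumes that the first ';' always starts the comment and that a ':' appears only after a name (so segment overrides such as ES:[BX] and string literals are not handled).

```python
def split_statement(line):
    """Split one assembly source line into (name, operation, operands, comment).

    Hypothetical helper for illustration only; real assemblers are stricter.
    """
    # Everything after the first ';' is the comment.
    code, sep, comment = line.partition(";")
    comment = comment.strip() if sep else None

    # An optional name (label) ends with ':'.
    name = None
    if ":" in code:
        name, code = code.split(":", 1)
        name = name.strip()

    # The first word is the operation; the rest is a comma-separated operand list.
    parts = code.split(None, 1)
    operation = parts[0].upper() if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return name, operation, operands, comment


print(split_statement("START: MOV CX, 5 ; comment"))
# ('START', 'MOV', ['CX', '5'], 'comment')
```

A line without a name or a comment simply yields `None` in those positions, which mirrors the fact that both fields are optional in the statement format.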
Example
• Consider the instruction that tells an x86 processor to move an immediate 8-bit value into a register.
• The binary code for this instruction is 10110,
• followed by a 3-bit identifier for which register to use. The identifier for the AL register is 000,
• so the following machine code loads the AL register with the data 01100001:

          instruction   register   data
Binary    10110         000 (AL)   01100001
Hex       B0 (instruction + register)   61

• Intel assembly language provides the mnemonic MOV (an abbreviation of "move") for instructions such as this,
• so the machine code above can be written as follows in assembly language, with an optional explanatory comment after the semicolon:

MOV AL, 61h ; Load AL with 97 decimal (61 hex)

• The Intel opcode 10110000 (B0) copies an 8-bit value into the AL register,
• while 10110001 (B1) copies it into CL, and
• 10110010 (B2) copies it into DL.

MOV AL, 1h ; Load AL with immediate value 1
MOV CL, 5h ; Load CL with immediate value 5
MOV DL, 3h ; Load DL with immediate value 3
MOV EAX, [EBX] ; Move the 4 bytes in memory at the address contained in EBX into EAX
MOV [ESI+EAX], CL ; Move the contents of CL into the byte at address ESI+EAX

• Transforming assembly language into machine code is the job of an assembler.
• The reverse can be achieved, at least partially, by a disassembler.
• Each computer architecture has its own machine language.
• Computers differ in the number and type of operations they support, in the sizes and numbers of their registers, and in how data is represented in storage.
• While most general-purpose computers can carry out essentially the same functionality, the ways they do so differ, and the corresponding assembly languages reflect these differences.

.section .data
• Anything starting with a period isn’t directly translated into a machine instruction.
• Instead, it’s an instruction to the assembler itself.
These are called assembler directives or pseudo-operations, because they are handled by the assembler and are not actually run by the computer.
• The .section directive breaks your program up into sections.
• .section .data starts the data section, where you list any memory storage you will need for data.

.section .text
• This starts the text section. The text section of a program is where the program instructions live.

.globl _start
• This tells the assembler that _start is important to remember: the symbol must not be discarded after assembly, because the linker will need it.
• _start is a symbol, which means that it is going to be replaced by something else, either during assembly or during linking.
• Symbols are generally used to mark locations of programs or data, so you can refer to them by name instead of by their location number.

_start:
• This defines the value of the _start label. A label is a symbol followed by a colon.
• Labels define a symbol’s value. When the assembler is assembling the program, it has to assign each data value and instruction an address. A label tells the assembler to make the symbol’s value the address of the next instruction or data element.
• This way, if the actual physical location of the data or instruction changes, you don’t have to rewrite any references to it: the symbol automatically gets the new value.

Next come the instructions:
• MOV
• ADD
• . . .

Just a recap (important): the 8086 processor

8086 Internal Configuration
[Figure: simplified block diagram of the Intel 8088 (a variant of the 8086). 1 = main registers; 2 = segment registers and IP; 3 = address adder; 4 = internal address bus; 5 = instruction queue; 6 = control unit (very simplified!); 7 = bus interface; 8 = internal data bus; 9 = ALU; 10/11/12 = external address/data/control bus]
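To tie together the ideas covered in these notes (directives, labels, and the encoding of MOV with an 8-bit immediate), here is a minimal Python sketch of a toy assembler. It is an illustration under stated assumptions, not a real assembler: it understands only `MOV r8, imm8` (opcode bits 10110 followed by the 3-bit register number, then the data byte), it skips every line that starts with a period, and it records each label as an offset from the start of the generated code rather than a real memory address. The names `assemble`, `parse_imm` and the label `next` are hypothetical, and mixing `.section`/`.globl` directives with Intel-style `MOV AL, 61h` lines simply mirrors the snippets in these notes.

```python
# 3-bit register numbers for the x86 8-bit registers.
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}


def parse_imm(text):
    """Parse '61h' as hexadecimal and '97' as decimal."""
    text = text.strip()
    if text.lower().endswith("h"):
        return int(text[:-1], 16)
    return int(text, 10)


def assemble(lines):
    """Toy assembler (illustration only): returns (machine_code, symbol_table)."""
    code = bytearray()
    symbols = {}
    for raw in lines:
        line = raw.split(";", 1)[0].strip()    # drop the comment
        if not line or line.startswith("."):   # directives emit no machine code
            continue
        if ":" in line:                        # label: value = offset of next instruction
            label, line = line.split(":", 1)
            symbols[label.strip()] = len(code)
            line = line.strip()
            if not line:
                continue
        mnemonic, operands = line.split(None, 1)
        dest, imm = [part.strip() for part in operands.split(",")]
        if mnemonic.upper() != "MOV" or dest.upper() not in REG8:
            raise ValueError(f"unsupported statement: {raw!r}")
        value = parse_imm(imm)
        if not 0 <= value <= 0xFF:
            raise ValueError(f"immediate does not fit in 8 bits: {raw!r}")
        code.append(0b10110000 | REG8[dest.upper()])  # 10110 + register number
        code.append(value)                            # the 8-bit data
    return bytes(code), symbols


program = [
    ".section .text",
    ".globl _start",
    "_start:",
    "    MOV AL, 61h ; Load AL with 97 decimal (61 hex)",
    "next:",
    "    MOV CL, 5h",
    "    MOV DL, 3h",
]
machine_code, symbols = assemble(program)
print(machine_code.hex(" "))  # b0 61 b1 05 b2 03
print(symbols)                # {'_start': 0, 'next': 2}
```

If another instruction were inserted before `next:`, the value stored for `next` would change automatically on the next run, which is the point of using labels instead of hard-coded location numbers.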