THE PC MACHINE INSTRUCTION SET

Many instructions have a specific purpose, so that a 1-byte machine language instruction code is adequate. The following are examples:

    MACHINE CODE   SYMBOLIC INSTRUCTION   COMMENT
    40             INC AX                 ;Increment AX
    50             PUSH AX                ;Push AX
    C3             RET (near)             ;Near return from procedure
    CB             RET (far)              ;Far return from procedure
    FD             STD                    ;Set Direction Flag

None of these instructions makes a direct reference to memory. Instructions that specify an immediate operand, two registers, or a reference to memory are more complex and require two or more bytes of machine code. Machine code has a special provision for indicating a particular register and another provision for referencing memory by means of an addressing mode byte.

REGISTER NOTATION

Instructions that reference a register may contain three bits that indicate the particular register and a w-bit that indicates whether the width is a byte (0) or a word (1). Also, only certain instructions may access the segment registers. Figure 1 shows the register notations. For example, bit value 000 means AL if the w-bit is 0 and AX if it is 1.

Here’s the symbolic and machine code for a MOV instruction with a one-byte immediate operand:

    MOV AH,00    10110 100  00000000
                     | |||
                     w reg = AH

In this case, the first byte of machine code indicates a width of one byte (w = 0) and refers to AH (100). Here’s a MOV instruction that contains a one-word immediate operand, along with its generated machine code:

    MOV AX,00    10111 000  00000000 00000000
                     | |||
                     w reg = AX

The first byte of machine code indicates a width of one word (w = 1) and refers to AX (000). For other instructions, w and reg may occupy different positions. Also, the first byte of machine code may contain a d-bit that indicates the direction (left/right) of flow.
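The two MOV encodings above can be reproduced with a few lines of code. The following is a minimal sketch in Python, not part of the original text: the helper name encode_mov_imm and the two lookup tables are hypothetical, the register codes are those of Figure 1, and the sketch covers only the 8- and 16-bit general registers. It also assumes the usual x86 convention, not stated above, that a word immediate is stored low byte first.

```python
# Register codes for the 3-bit reg field (values as listed in Figure 1).
REG8 = {"AL": 0b000, "CL": 0b001, "DL": 0b010, "BL": 0b011,
        "AH": 0b100, "CH": 0b101, "DH": 0b110, "BH": 0b111}
REG16 = {"AX": 0b000, "CX": 0b001, "DX": 0b010, "BX": 0b011,
         "SP": 0b100, "BP": 0b101, "SI": 0b110, "DI": 0b111}


def encode_mov_imm(register, value):
    """Encode MOV register,immediate as 1011 w reg plus the immediate bytes.

    Hypothetical helper for illustration only. A word immediate is
    emitted low byte first (assumed x86 byte order).
    """
    name = register.upper()
    if name in REG8:
        w, reg, size = 0, REG8[name], 1
    elif name in REG16:
        w, reg, size = 1, REG16[name], 2
    else:
        raise ValueError(f"unsupported register: {register}")
    if not 0 <= value < (1 << (8 * size)):
        raise ValueError("immediate does not fit the register width")
    # First byte: fixed bits 1011, then the w-bit, then the 3-bit reg field.
    first_byte = (0b1011 << 4) | (w << 3) | reg
    return bytes([first_byte]) + value.to_bytes(size, "little")


if __name__ == "__main__":
    print(encode_mov_imm("AH", 0).hex(" "))  # MOV AH,00
    print(encode_mov_imm("AX", 0).hex(" "))  # MOV AX,00
```

Written in binary, the first byte produced for AH is 10110100 and for AX is 10111000, matching the bit patterns shown in the two examples.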
    General, Base, and Index Registers       Segment Registers
    Bits   w=0   w=1                         Bits   Register
    000    AL    AX/EAX                      000    ES
    001    CL    CX/ECX                      001    CS
    010    DL    DX/EDX                      010    SS
    011    BL    BX/EBX                      011    DS
    100    AH    SP                          100    FS
    101    CH    BP                          101    GS
    110    DH    SI
    111    BH    DI

    Figure 1 Register Notation

THE ADDRESSING MODE BYTE

The mode byte, when present, occupies the second byte of machine code and consists of the following three elements:

    mod   A 2-bit mode, where the values 00, 01, and 10 refer to memory locations and 11 refers to a register
    reg   A 3-bit reference to a register
    r/m   A 3-bit reference to a register or memory, where r specifies which register and m indicates a memory address

In the following example of adding AX to BX

    ADD BX,AX    00000011 11 011 000
                       || || ||| |||
                       dw mod reg r/m

d = 1 means that reg (011) describes the first operand and that mod (11) plus r/m (000) describe the second operand. Since w = 1, the width is a word. Therefore, the instruction is to add AX (000) to BX (011).

Mod Bits. The two mod bits distinguish between addressing of registers and memory.

    00   r/m bits give the exact addressing option; no offset byte (unless r/m = 110).
    01   r/m bits give the exact addressing option; one offset byte.
    10   r/m bits give the exact addressing option; two offset bytes.
    11   r/m specifies a register. The w-bit (in the operation code byte) determines whether a reference is to an 8-, 16-, or 32-bit register.

Reg Bits. The three reg bits, in association with the w-bit, determine the actual register and its width.

R/M Bits. The three r/m (register/memory) bits, in association with the mod bits, determine the addressing mode, as shown in Figure 2.
    r/m   mod=00     mod=01 or 10       mod=11, w=0   mod=11, w=1
    000   [BX+SI]    DS:[BX+SI+disp]    AL            AX
    001   [BX+DI]    DS:[BX+DI+disp]    CL            CX
    010   [BP+SI]    SS:[BP+SI+disp]    DL            DX
    011   [BP+DI]    SS:[BP+DI+disp]    BL            BX
    100   [SI]       DS:[SI+disp]       AH            SP
    101   [DI]       DS:[DI+disp]       CH            BP
    110   Direct     SS:[BP+disp]       DH            SI
    111   [BX]       DS:[BX+disp]       BH            DI

    Figure 2 The r/m Bits

Two-Byte Instructions

The following two-byte instruction adds BX to AX:

    ADD AX,BX    00000011 11 000 011
                       || || ||| |||
                       dw mod reg r/m

    d = 1       reg plus w describes the first operand (AX), and mod plus r/m plus w describes the second operand (BX).
    w = 1       The width is a word.
    mod = 11    The second operand is a register.
    reg = 000   The first operand is AX.
    r/m = 011   The second operand is BX.

The next example multiplies AL by BL:

    MUL BL       11110110 11 100 011
                        | || ||| |||
                        w mod reg r/m

The width (w = 0) is a byte, mod (11) references a register, and the register (r/m = 011) is BL. Reg = 100 does not name a register here; for this operation code the reg bits act as part of the operation code and select MUL. The processor assumes that the multiplicand is in AL if the multiplier is a byte (as in this example), AX if a word, and EAX if a doubleword.

Three-Byte Instructions

The following MOV generates three bytes of machine code:

    MOV mem-word,AX    10100011 mmmmmmmm mmmmmmmm
                             ||
                             dw

A move from the accumulator (AX or AL) needs to know only whether the operation is byte or word. In this example, w = 1 means a word, and the 16-bit AX is understood. (AL coded in the second operand would cause the w-bit to be zero.) Bytes 2 and 3 contain the offset to the memory location. Using the accumulator register often generates a shorter instruction and faster execution than the use of other registers.

Four-Byte Instructions

The following four-byte instruction multiplies AL by a memory location:

    MUL mem-byte    11110110 00 100 110 mmmmmmmm mmmmmmmm
                           | || ||| |||
                           w mod reg r/m

For this instruction, although reg is 100, the multiplicand is assumed to be AL (one byte, because w = 0).
Mod = 00 indicates a memory reference, and r/m = 110 means a direct reference to memory; the two subsequent bytes provide the offset to the memory location.

The next example illustrates the LEA instruction, which specifies a word address:

    LEA DX,memory    10001101 00 010 110 mmmmmmmm mmmmmmmm
                              || ||| |||
                              mod reg r/m

Reg = 010 designates DX; mod = 00 and r/m = 110 indicate a direct reference to a memory address; the next two bytes provide the offset to this location.
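The way the mode byte is read in these examples can be sketched as a small decoder. The following Python is an illustration only, not part of the original text: the names split_mode_byte, describe_rm, and MEMORY_FORMS are hypothetical, the tables are transcribed from Figures 1 and 2, and the default segment prefixes (DS: and SS:) are left out of the output. It also assumes two details not stated above: a two-byte offset is stored low byte first, and a one-byte offset is treated as a signed value.

```python
# Memory forms for mod = 00, 01, and 10, indexed by r/m (from Figure 2).
# Entry 110 is BP here; with mod = 00 it means a direct address instead.
MEMORY_FORMS = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def split_mode_byte(byte):
    """Return the (mod, reg, r/m) fields of an addressing mode byte."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def describe_rm(mode_byte, w, following=b""):
    """Describe the operand selected by mod and r/m.

    `following` holds the bytes that come after the mode byte.
    Returns (operand text, number of offset bytes used).
    """
    mod, _reg, rm = split_mode_byte(mode_byte)
    if mod == 0b11:
        # Register operand: the w-bit picks the byte or word register.
        return (REG16 if w else REG8)[rm], 0
    direct = mod == 0b00 and rm == 0b110
    size = 2 if direct else (0, 1, 2)[mod]
    if len(following) < size:
        raise ValueError("missing offset bytes")
    raw = following[:size]
    if direct:
        # Direct reference: two offset bytes, assumed low byte first.
        return f"[{int.from_bytes(raw, 'little'):04X}h]", size
    if size == 0:
        return f"[{MEMORY_FORMS[rm]}]", 0
    # Assumption: a one-byte offset is signed; a two-byte offset is not.
    disp = int.from_bytes(raw, "little", signed=(size == 1))
    return f"[{MEMORY_FORMS[rm]}{disp:+d}]", size


if __name__ == "__main__":
    # Mode bytes from the ADD AX,BX, MUL BL, MUL mem-byte, and LEA examples.
    # The offset bytes 34 12 are made-up values standing in for mmmmmmmm.
    for mode, w, rest in [(0xC3, 1, b""), (0xE3, 0, b""),
                          (0x26, 0, b"\x34\x12"), (0x16, 1, b"\x34\x12")]:
        print(split_mode_byte(mode), describe_rm(mode, w, rest))
```

The reg field is returned by split_mode_byte but deliberately not interpreted in describe_rm, because its meaning depends on the operation code: it names AX in the ADD example and DX in the LEA example, but not a register in the MUL examples.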