COS2014 Basic Concepts
Department of Computer Science, Faculty of Science, RU
Assembly Language for Intel-Based Computers, 5th Edition, Kip Irvine

Why study assembly language?
• Learn about computers
  ‒ Computer architecture
  ‒ Operating systems
  ‒ Data representation
  ‒ Hardware devices
• Learn about assembly languages
• Learn about compiling
• Learn how to write embedded programs
• Learn the assembly language for Intel 80x86

Welcome to Assembly Language: Definitions
• Assembly language: a machine-specific language with a one-to-one correspondence to the machine language of that computer
• Machine language: the language a particular processor understands
• Assembler: converts programs from assembly language to machine language

Welcome to Assembly Language: Example
High-Level Language    Assembly Language    Machine Language
x = a + b;             MOV AX, a            A1 0002
                       ADD AX, b            06 0004
                       MOV x, AX            A3 0000

Welcome to Assembly Language: Problems with assembly language
• Provides no structure
• Is not portable
• Applications can be very long
• “Hard” to read and understand
• Lots of detail required

Assembly Language
• Machine language:
  ‒ Machine instructions: direct instructions to the processor, e.g. encoded to control the datapath
  ‒ A numeric language understood by the processor
• Assembly language:
  ‒ Statements (instructions) written as short mnemonics and symbolic references, e.g. ADD, MOV, CALL, var1, i, j, that have a 1-to-1 relationship with machine instructions
  ‒ Understood by humans

When you are writing a C program, what does a computer look like?
• CPU, memory, I/O, operations on variables, …

A Model of Computer for C
[Figure: the computer as a C program sees it: a CPU with a program counter and operations (+, &&, if, for); memory as typed storage holding i, j, k, xfloat, yfloat, A[0], A[1], …; and program statements such as i = i + j; xfloat = 1.0; if (A[0]==0) …]

This model is quite different from what the hardware in a computer does when you run your program!
Why can a C program run on your computer?
Obviously, someone does some translation for you!
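
As a rough illustration of the x = a + b example earlier in this section, the Python sketch below steps through the three assembly statements on a toy machine with a single 16-bit register. The `run` function, the tuple encoding of instructions, and the sample values of a and b are assumptions made for illustration; this is not how a real assembler or an 80x86 processor works internally.

```python
# Toy model (hypothetical): one 16-bit register AX plus named memory cells.
# Instructions are (mnemonic, destination, source) tuples, an encoding
# invented here for illustration only.

WORD_MASK = 0xFFFF  # AX is a 16-bit register


def run(program, memory):
    """Execute MOV/ADD instructions one by one and return the final memory."""
    registers = {"AX": 0}
    memory = dict(memory)  # work on a copy so the caller's dict is untouched

    def read(name):
        return registers[name] if name in registers else memory[name]

    def write(name, value):
        target = registers if name in registers else memory
        target[name] = value & WORD_MASK  # keep results within 16 bits

    for mnemonic, dst, src in program:
        if mnemonic == "MOV":
            write(dst, read(src))
        elif mnemonic == "ADD":
            write(dst, read(dst) + read(src))
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")
    return memory


# x = a + b, written as the three assembly statements from the slide
PROGRAM = [("MOV", "AX", "a"), ("ADD", "AX", "b"), ("MOV", "x", "AX")]

if __name__ == "__main__":
    print(run(PROGRAM, {"a": 2, "b": 3, "x": 0}))
```

Each high-level statement becomes several register-level steps, which is the translation the compiler performs on the programmer's behalf.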
We have seen …
C program:           x = (a+b) * c
  ↓ C compiler
Assembly program:    MOV AX, a
                     ADD AX, b
                     MUL c
                     MOV x, AX

Assembly language is closer to the real computer hardware!
From the viewpoint of assembly language, what does a computer look like?

A Model of Computer for ASM
[Figure: the computer as an assembly program sees it: a CPU with registers (PC, AX, BX, …) and operations (+, -); memory holding the program (MOV AX, a; ADD AX, b; MOV x, AX; …) and the binary contents of a, b, and x]

Assembly programs and machine code are still some distance from the real computer hardware, e.g. multi-core CPUs and hyperthreading.
We need one more level of translation!

Computer Model of a Lower Layer
[Figure: pipelined datapath with IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers, instruction memory, register file, sign extension, ALU and ALU control, data memory, and control signals such as PCSrc, ALUSrc, RegDst, MemRead, MemWrite, MemtoReg, and RegWrite (from Computer Architecture textbook)]

A Layered View of Computer
[Figure: the three models stacked from top to bottom: the C-level model, the assembly-level model, and the pipelined datapath]

From Assembly to Binary
Assembly:
    MOV AX, a
    ADD AX, b
    MUL c
    MOV x, AX
  ↓ Assembler
Machine code:
    00000000101000010000000000011000
    00000000100011100001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000100
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000

Different Levels of Abstractions
High-Level Language Program:
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program:
    MOV AX, a
    ADD AX, b
    MOV x, AX
  ↓ Assembler
Machine Language Program:
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Control Signal:
    ALUOP[0:3] <= InstReg[9:11] & MASK
(Elaborated later in Sec. 1-2: Virtual Machine Concept)

What’s Next?
• Virtual machine concept (Sec. 1-2)
• Data representation (Sec. 1-3)

Virtual Machine Concept
Purpose of this section:
• Understand the role of assembly language in a computer system
• By-product: the principle of layered abstraction for combating complexity, e.g. the OSI 7-layer protocol model

Virtual Machine Concept
• A layered abstraction of computers proposed by A. Tanenbaum
• Each layer provides an abstract computer, or virtual machine, to its upper layer
• Virtual machine: a hypothetical computer that can be constructed of either HW or SW
What is a computer?
    Level 5  High-Level Language
    Level 4  Assembly Language
    Level 3  Operating System
    Level 2  Instruction Set Architecture
    Level 1  Microarchitecture
    Level 0  Digital Logic

Simplest Model of Computers
[Figure: a compute engine with memory takes a program (instructions) and input data and produces output data; c.f. y = f(x)]
• Layered abstraction: a computer consists of layers of such virtual machine abstractions

Why Layered Abstraction?
• Big idea: layered abstraction to combat complexity
  ‒ A strategy of divide-and-conquer
  ‒ Decompose a complex system into layers with well-defined interfaces
• Each layer is easier to manage and handle
  ‒ Only need to focus on a particular layer, e.g. to make it the best
• Also, it makes interaction clear
  ‒ Particularly if one layer is realized in hardware and the other in software

Layered Abstraction of Computer
• Each layer is a hypothetical computer, or virtual machine, that runs a programming language
  ‒ Can be programmed with that programming language to process inputs and outputs
• A program written in Li can be mapped to Li-1 by:
  ‒ Interpretation: an Li-1 program interprets and executes the Li instructions one by one
  ‒ Translation: the Li program is completely translated into an Li-1 program, which runs on the Li-1 machine

Layered Abstraction of Computer
[Figure: the C-level model and the assembly-level model as virtual machine Li, sitting on top of the next lower model as Li-1, down to the pipelined datapath]
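
The two mappings named above, interpretation and translation, can be contrasted with a small Python sketch. Everything here is hypothetical: the three-instruction accumulator language stands in for Li, and Python itself plays the role of Li-1. It is meant only to show the difference in when the lower-level code is produced, not how any real interpreter or compiler is built.

```python
# Hypothetical Li program: (operation, operand) pairs acting on one accumulator.
PROGRAM = [("load", 4), ("add", 7), ("mul", 2)]


def interpret(program):
    """Interpretation: this Li-1 code executes the Li instructions one by one."""
    acc = 0
    for op, arg in program:
        if op == "load":
            acc = arg
        elif op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown operation: {op}")
    return acc


def translate(program):
    """Translation: convert the whole Li program into Li-1 source text first."""
    templates = {"load": "acc = {}", "add": "acc += {}", "mul": "acc *= {}"}
    lines = ["acc = 0"]
    for op, arg in program:
        lines.append(templates[op].format(arg))
    return "\n".join(lines)


def run_translated(source):
    """Run the translated program directly on the Li-1 machine (Python)."""
    namespace = {}
    exec(source, namespace)
    return namespace["acc"]


if __name__ == "__main__":
    print(interpret(PROGRAM))
    print(translate(PROGRAM))
    print(run_translated(translate(PROGRAM)))
```

Both routes compute the same value; the interpreter decodes each instruction every time it runs, while the translator pays that cost once and leaves a standalone lower-level program behind.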

Languages of Different Layers
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
Assembly Language:
    mov eax,A
    mul B
    add eax,C
    call WriteInt
Intel Machine Language:
    A1 00000000
    F7 25 00000004
    03 05 00000008
    E8 00500000

High-Level Language (Level 5)
• Application-oriented languages, e.g., C, C++, Java, Perl
• Written with a certain programming model in mind
  ‒ Variables in storage
  ‒ Operators for operations
• Programs are compiled into assembly language (Level 4) or interpreted by interpreters
What kind of computer does C see?

Assembly Language (Level 4)
• Instruction mnemonics that have a one-to-one correspondence to machine language
• Based on a view of the machine: register organization, addressing, operand types and locations, functional units, …
• Calls functions written at the OS level (Level 3)
• Programs are translated into machine language (Level 2)
What kind of computer does it see?

Operating System (Level 3)
• Provides services to Level 4 programs as if it were a computer
• Programs are translated and run at the instruction set architecture level (Level 2)

Instruction Set Architecture (Level 2)
• Known as conventional machine language
• Attributes of a computer as seen by the assembly programmer, i.e.
  conceptual structure and functional behavior
  ‒ Organization of programmable storage
  ‒ Data types and data structures
  ‒ Instruction set and formats
  ‒ Addressing modes and data accessing
• Executed by the Level 1 program (microarchitecture)

Microarchitecture (Level 1)
• Can be described by register transfer language (RTL)
• Interprets conventional machine instructions (Level 2)
• Executed by digital hardware (Level 0)
[Figure: a clocked controller sending control signals to the registers (IR, PC), the ALU with N and Z flags, and memory]

Digital Logic (Level 0)
• CPU, constructed from digital logic gates
• System bus
• Memory

Data Representation
Purpose of this section:
• Assembly programs often need to process data and manage data storage and memory locations → need to know data representation and storage
• Binary numbers: translating between binary and decimal
• Binary addition
• Integer storage sizes
• Hexadecimal integers: translating between decimal and hex; hex subtraction
• Signed integers: binary subtraction
• Character storage

Binary Numbers
• Digits are 1 and 0
  ‒ 1 = true
  ‒ 0 = false
• MSB: most significant bit
• LSB: least significant bit
• Bit numbering:
    MSB          LSB
    1011001010011100
    15             0

Binary Numbers
• Each digit (bit) is either 1 or 0
• Each bit represents a power of 2:
     1    1    1    1    1    1    1    1
    2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
• Every binary number is a sum of powers of 2

Integer Storage Sizes
Standard sizes:
    byte          8 bits
    word         16 bits
    doubleword   32 bits
    quadword     64 bits
Why unsigned numbers?
What is the largest unsigned integer that may be stored in 20 bits?

Signed Integers
• The highest bit indicates the sign: 1 = negative, 0 = positive
    sign bit
    1 1 1 1 0 1 1 0    Negative
    0 0 0 0 1 0 1 0    Positive
• If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D

Forming Two's Complement
• Negative numbers are stored in two's complement notation
• Complement (reverse) each bit
• Add 1
Why?
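
The two steps above (complement each bit, then add 1) can be sketched in Python as follows. The function names and the default 8-bit width are assumptions chosen for illustration; the sample values are the negative and positive bit patterns from the Signed Integers slide.

```python
def twos_complement(value, bits=8):
    """Return the two's complement of `value` within a fixed bit width."""
    mask = (1 << bits) - 1
    inverted = value ^ mask          # step 1: complement (reverse) each bit
    return (inverted + 1) & mask     # step 2: add 1, discarding any carry out


def to_signed(value, bits=8):
    """Interpret a stored bit pattern as a signed integer."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


if __name__ == "__main__":
    positive = 0b00001010                    # +10
    negative = twos_complement(positive)     # bit pattern that stores -10
    print(format(negative, "08b"), to_signed(negative))
    # A value plus its two's complement wraps around to zero in 8 bits.
    print((positive + negative) & 0xFF)
```

Because a value and its two's complement always sum to zero once the carry is dropped, the same adder circuit can handle both addition and subtraction.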
Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded)

Binary Addition
• Starting with the LSB, add each pair of digits, and include the carry if present.
       carry:            1
                 0 0 0 0 0 1 0 0    (4)
               + 0 0 0 0 0 1 1 1    (7)
                 0 0 0 0 1 0 1 1    (11)
bit position:    7 6 5 4 3 2 1 0

Hexadecimal Integers
• All values in memory are stored in binary. Because long binary numbers are hard to read, we use hexadecimal representation.

Translating Binary to Hexadecimal
• Each hexadecimal digit corresponds to 4 binary bits.
• Example: Translate the binary integer 000101101010011110010100 to hexadecimal:
    0001 0110 1010 0111 1001 0100
      1    6    A    7    9    4

Converting Hexadecimal to Decimal
• Multiply each digit by its corresponding power of 16:
    dec = (D3 * 16^3) + (D2 * 16^2) + (D1 * 16^1) + (D0 * 16^0)
• Hex 1234 equals (1 * 16^3) + (2 * 16^2) + (3 * 16^1) + (4 * 16^0), or decimal 4,660.
• Hex 3BA4 equals (3 * 16^3) + (11 * 16^2) + (10 * 16^1) + (4 * 16^0), or decimal 15,268.

Data representation: Number systems (bases)
Number systems used:
• Binary: the internal representation inside the computer. Externally, values may be represented in binary, decimal, or hexadecimal.
• Decimal: the system people use.
• ASCII representations of numbers used for I/O: ASCII binary, ASCII octal, ASCII decimal, ASCII hexadecimal

Data representation: Hex Addition & Multiplication
• Hex addition and multiplication tables are large
• We can still do simple calculations by hand
      B852h        23Ah
    + 5A65h      * 100h
(Your turn)
      ABCh         2B3h
    + EF0h       * 102h

Data representation: Converting to decimal
• 12345 = 1 * 10^4 + 2 * 10^3 + 3 * 10^2 + 4 * 10^1 + 5 * 10^0
• (Human) conversions: hex to decimal
    ABCDh = 10 * 16^3 + 11 * 16^2 + 12 * 16^1 + 13 * 16^0
          = 10 * 4096 + 11 * 256 + 12 * 16 + 13
          = 40960 + 2816 + 192 + 13
          = 43981

Data representation: Converting to decimal
• (Human) conversions to decimal
    ABCDh = ((10 * 16 + 11) * 16 + 12) * 16 + 13 = 43981 (easier on a calculator)

Data representation: Your Turn: Conversion problems
• 111010b = ________ (base 10)
• 1234 base 5, or (1234)5 = ________ (base 10)

Data representation: Conversion from decimal
(Human) conversions from decimal
2748 (base 10) = ???
In hex:
    2748 = 171 * 16 + 12
     171 =  10 * 16 + 11
      10 =   0 * 16 + 10
so the value is ABCh (the remainders 10, 11, 12, read from last to first, are the hex digits A, B, C)
How do we know this is the hex representation?
    2748 = 171 * 16 + 12
         = (10 * 16 + 11) * 16 + 12
         = 10 * 16^2 + 11 * 16^1 + 12 * 16^0
         = ABCh

Data representation: Your Turn: Conversion problems
• Write decimal 58 in binary.
• Write decimal 194 in base 5.

Learn How To Do the Following:
• Form the two's complement of a hexadecimal integer
• Convert signed binary to decimal
• Convert signed decimal to binary
• Convert signed decimal to hexadecimal
• Convert signed hexadecimal to decimal

Ranges of Signed Integers
• The highest bit is reserved for the sign. This limits the range: with n bits, signed values run from -2^(n-1) to 2^(n-1) - 1.
• Practice: What is the largest positive value that may be stored in 20 bits?

Character Storage
• Character sets
  ‒ Standard ASCII (0 – 127)
  ‒ Extended ASCII (0 – 255)
  ‒ ANSI (0 – 255)
  ‒ Unicode (0 – 65,535)
• Null-terminated string
  ‒ Array of characters followed by a null byte
• Using the ASCII table
  ‒ See the back inside cover of the book

Numeric Data Representation
• Pure binary: can be calculated directly
• ASCII binary: string of digits, e.g. "01010101"
• ASCII decimal: string of digits, e.g. "65"
• ASCII hexadecimal: string of digits, e.g. "9C"

Character Representation: ASCII (Table of ASCII Codes)
• American Standard Code for Information Interchange
• Standard encoding scheme used to represent characters in binary format on computers
• 7-bit encoding, so 128 characters can be represented
• 0 to 31 & 127 are "control characters" (cannot print)
  ‒ Ctrl-A or ^A is 1, ^B is 2, etc.
  ‒ Used for screen formatting & data communication
• 32 to 126 are printable (see the last page in textbook)

ASCII Character Codes
    CHAR    DECIMAL    HEX    BINARY
    '0'     48d        30h    0011 0000b
    '9'     57d        39h    0011 1001b
    'A'     65d        41h    0100 0001b
    'Z'     90d        5Ah    0101 1010b
    'a'     97d        61h    0110 0001b
    'z'     122d       7Ah    0111 1010b

Binary Data
• Decimal, hex & character representations are easier for humans to understand; however…
• All the data in the computer is binary
• An int is typically 32 binary digits
  ‒ int y = 5; (y = 0x00000005;)
    In computer, y = 00000000 00000000 00000000 00000101
  ‒ int z = -5; (z = 0xFFFFFFFB;)
    In computer, z = 11111111 11111111 11111111 11111011

Binary Data
• A char is typically 8 binary digits
  ‒ char x = '5'; (or char x = 53, or char x = 0x35)
    In computer, x = 00110101
  ‒ char x = 5; (or char x = 0x05;)
    In computer, x = 00000101
• Note that the ASCII character 5 has a different binary value than the numeral 5
• Also note that 1 ASCII character = 2 hex digits

Storage Size Terminology
    Byte          8 bits (basic storage size for all data)
    Word         16 bits (2 bytes)
    Doubleword   32 bits (4 bytes)
    Quadword     64 bits (8 bytes)
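
The bit patterns shown on the Binary Data slides can be reproduced with a short Python sketch. The helper name `bit_pattern` is hypothetical, and the 32-bit int and 8-bit char widths are the "typical" sizes the slides assume, not a guarantee for every platform.

```python
def bit_pattern(value, width):
    """Return the stored bits of `value` in `width` bits, grouped by byte."""
    mask = (1 << width) - 1
    raw = format(value & mask, f"0{width}b")  # masking yields two's complement for negatives
    return " ".join(raw[i:i + 8] for i in range(0, width, 8))


if __name__ == "__main__":
    print("int y = 5    ->", bit_pattern(5, 32))
    print("int z = -5   ->", bit_pattern(-5, 32), format(-5 & 0xFFFFFFFF, "08X"))
    # The character '5' is stored as its ASCII code 53 (35h), not as the number 5.
    print("char x = '5' ->", bit_pattern(ord("5"), 8), format(ord("5"), "02X"))
    print("char x = 5   ->", bit_pattern(5, 8), format(5, "02X"))
```

One byte always prints as two hex digits, which is why a single ASCII character corresponds to two hex digits.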