CMPE 325 Computer Architecture II
Cem Ergün
Eastern Mediterranean University

Using Assembly

Using Arrays for Counting
Consider the C code for counting how many elements of an array equal a target value, where int target, int n, and int *list are available in the parameter registers $a0-$a2:

    int count = 0;
    int i;
    for (i = 0; i < n; i++) {
        if (list[i] == target)
            count++;
    }

CMPE325 CH #3 Slide #2

Using Arrays Solution
Writing the loop:

          li   $t0, 0            # count = 0
          li   $t1, 0            # i = 0
    Loop: bge  $t1, $a1, Exit    # goto Exit if i >= n
          add  $t2, $t1, $t1     # $t2 = 2 * i
          add  $t2, $t2, $t2     # $t2 = 4 * i
          add  $t3, $t2, $a2     # $t3 = list + 4 * i
          lw   $t4, 0($t3)       # $t4 = list[i]
          bne  $t4, $a0, Next    # goto Next if $t4 != target
          addi $t0, $t0, 1       # count++
    Next: addi $t1, $t1, 1       # i++
          j    Loop              # Loop again
    Exit:

MIPS Assembler Directives
SPIM supports a subset of the MIPS assembler directives. Some of the directives include:
    .asciiz : Store a null-terminated string in memory
    .data   : Start of data segment
    .globl  : Identify an exported symbol
    .text   : Start of text segment
    .word   : Store words in memory
See Appendix A for details and examples.

Representing Instructions
    High-level:  C Program (.c)         -> Compiler
    Assembly:    Assembly Program (.s)  -> Assembler
    Machine:     Object Module (.o)     -> Linker -> Executable -> Loader -> Memory

Assembler
• Expands macros and pseudoinstructions, and converts values (ex. 0xFF for hex)
• Primary purpose is to produce an object file containing:
    • Machine language instructions
    • Application data
    • Information for memory organization

Object File Includes
• Object header: describes file organization
• Text segment: machine code
• Data segment: static and dynamic data
• Relocation information: identifies instructions/data that depend on absolute addresses when the program is loaded
• Symbol table: list of labels that are not defined (ex. external references)
• Debugging information: describes the relationship between source code and machine instructions

Linker
• Linker combines multiple object modules
    • Identifies where code/data will be placed in memory
    • Resolves code/data cross references
    • Produces an executable if all references are found
• Steps
    1. Place code and data modules in memory
    2. Determine the address of data and instruction labels
    3. Patch both the internal and external references
• Separation between compiler and linker makes standard libraries an efficient solution to maintaining modular code

Loader
Loader used at run-time:
    1. Reads the executable file header for the size of the text/data segments
    2. Creates a sufficiently large address space
    3. Copies instructions and data from the executable into memory
    4. Copies parameters to the main program's stack
    5. Initializes machine registers and sets SP
    6. Jumps to the start-up routine
    7. Makes the exit system call when the program is done

Instruction Encoding
• As we have seen, instructions are written in several different ways, depending upon what types of information they need
• The MIPS architecture has three instruction formats, all 32 bits in length
    • Regularity is simpler and improves performance
• A 6 bit opcode appears at the beginning of each instruction
    • Needed by control logic to be able to decode the instruction type
    • See Appendix A.10 and Page 153 for a list

Machine Language
• All instructions have the same length (32 bits)
• DP3: Good design demands good compromises
    • Same length or same format
• Three different formats
    • R: arithmetic instruction format
    • I: transfer, branch, immediate format
    • J: jump instruction format
• add $t0, $s1, $s2 is 32 bits in machine language, with fields for:
    • Operation (add)
    • Operands ($s1, $s2, $t0)
• A[300] = h + A[300];

    lw  $t0, 1200($t1)    10001101001010000000010010110000
    add $t0, $s2, $t0     00000010010010000100000000100000
    sw  $t0, 1200($t1)    10101101001010000000010010110000
Instruction Formats

          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
    R:    op       rs       rt       rd       shamt    funct
    I:    op       rs       rt       address / immediate (16 bits)
    J:    op       target address (26 bits)

• op: basic operation of the instruction (opcode)
• rs: first source operand register
• rt: second source operand register
• rd: destination operand register
• shamt: shift amount
• funct: selects the specific variant of the opcode (function code)
• address: offset for load/store instructions (+/-2^15)
• immediate: constants for immediate instructions

Example
A[300] = h + A[300];    /* $t1 <= base of array A; $s2 <= h */

Compiler:
    lw  $t0, 1200($t1)    # temporary register $t0 gets A[300]
    add $t0, $s2, $t0     # temporary register $t0 gets h + A[300]
    sw  $t0, 1200($t1)    # stores h + A[300] back into A[300]

Assembler (decimal fields):
    lw:    35    9     8     1200
    add:   0     18    8     8     0     32
    sw:    43    9     8     1200

Assembler (binary fields):
    lw:    100011  01001  01000  0000 0100 1011 0000
    add:   000000  10010  01000  01000  00000  100000
    sw:    101011  01001  01000  0000 0100 1011 0000

R-Format
• Used by ALU instructions
• Uses three registers: one for destination and two for source
• Function code specifies which operation

    Bits:     6       5                5                 5           5                6
    Field:    OP=0    rs               rt                rd          sa               funct
    Meaning:          First Source     Second Source     Result      Shift Amount     Function
                      Register         Register          Register    (Chap 4)         Code

R-Format Example
• Consider the add instruction
    add $8, $17, $18
• We can fill in each of the fields

    Bits:      6        5       5       5       5       6
    Decimal:   OP=0     17      18      8       0       32
    Binary:    000000   10001   10010   01000   00000   100000

R-Format Limitations
• The R-Format works well for ALU-type operations, but does not work well for some of the other instructions we have seen
• Consider for example the lw instruction, which takes an offset
    • If placed in an R-format, it would have only 5 bits of space for the offset
    • Only 32 distinct offsets are not all that useful!
• A good design requires good compromises, so a single instruction format is not possible

Immediates (Numerical Constants)
• Small constants are used frequently (50% of operands)
    A = A + 5;
    C = C - 1;
• Solutions
    • Put typical constants in memory and load them
    • Create hardwired registers (e.g. $0 or $zero)
• Design Principle 4: make the common case fast
• MIPS instructions for constants (I format)
    addi $t0, $s7, 4    # $t0 = $s7 + 4

    Decimal:   8        23      8       4
    Binary:    001000   10111   01000   0000 0000 0000 0100

I-Format
• The immediate instruction format
• Uses different opcodes for each instruction
• Immediate field is signed (positive/negative)
• Used for loads and stores as well as immediate instructions (addi, lui, etc.)
• Also used for branches, since the branch destination is PC relative

    Bits:     6     5               5                16
    Field:    OP    rs              rt               imm
    Meaning:        First Source    Second Source    Immediate
                    Register        Register

I-Format Example
• Consider the addi instruction
    addi $8, $9, 1    # $t0 = $t1 + 1
• Fill in each of the fields

    Bits:      6        5       5       16
    Decimal:   8        9       8       1
    Binary:    001000   01001   01000   0000000000000001

Another I-Format Example
• Consider the while loop from before

    Loop: add $t0, $s0, $s0     # $t0 = 2 * i
          add $t0, $t0, $t0     # $t0 = 4 * i
          add $t1, $t0, $s3     # $t1 = &(A[i])
          lw  $t2, 0($t1)       # $t2 = A[i]
          bne $t2, $s2, Exit    # goto Exit if !=
          add $s0, $s0, $s1     # i = i + j
          j   Loop              # goto Loop
    Exit:

• Pretend the first instruction is located at address 80000

I-Format Example (Incorrect)
• Consider the bne instruction
    bne $t2, $s2, Exit    # goto Exit if $t2 != $s2
• Fill in each of the fields, using the byte distance to Exit as the immediate

    Bits:      6        5       5       16
    Decimal:   5        10      18      8
    Binary:    000101   01010   10010   0000000000001000

• This is not the optimum encoding

PC Relative Addressing
• What can we improve about our use of immediate addresses when branching?
• Since instructions are always 32 bits long, and since addressing is word aligned, we know that every instruction address must be a multiple of 4
• Therefore, we actually branch to the address that is PC + 4 + 4 × immediate

PC Relative Addressing
[Figure: instruction memory with byte addresses from 0000 to FFFF. The branch beq $s0,$s1,B sits at address A-4, the next instruction (addi $s3,$s3,5) at A, and the target (sub $t0,$s1,$t1) at B.]
• Branch instructions use PC-relative addressing
• Target is the label of address (B) in instruction memory
• PC contains the address of the next instruction = A
• PC-relative byte address = B - A
• 16-bit signed immediate = word address relative to the next instruction = k = (B - A)/4
• Target address = B = PC + 4 × Imm16
• The reference point is the next instruction: a positive imm16 = k branches forward, a negative imm16 = -k branches backward
• PC-relative word addresses range from -2^15 (farthest backward branch) to 2^15 - 1 (farthest forward branch), so the reachable region spans byte addresses A - 2^17 to A + 2^17 - 1

I-Format Example (Corrected)
• Re-consider the bne instruction
    bne $t2, $s2, Exit    # goto Exit if $t2 != $s2
• Use PC-relative addressing for the immediate

    Bits:      6        5       5       16
    Decimal:   5        10      18      2
    Binary:    000101   01010   10010   0000000000000010

Branching Far Away
• If the target is more than 2^15 words (2^17 bytes) away, then the compiler inverts the condition and inserts an unconditional jump
• Consider the example where L1 is far away
    beq $s0, $s1, L1    # goto L1 if $s0 == $s1
• Can be rewritten as
        bne $s0, $s1, L2    # Inverted
        j   L1              # Unconditional jump
    L2:

Far Target Address
[Figure: text segment (252MB) from 0x00400000 to 0x10000000. With the PC at 0x08000000, beq $s0, $s1, L1 can only reach from -2^17 (0x07fe0000) to +2^17 (0x08020000). L1 at 0x08200000 lies outside that range, so bne $s0, $s1, L2 followed by j L1 is used instead, with L2 placed right after the jump.]

I-Format Example: Load/Store
• Consider the lw instruction
    lw $t2, 0($t1)    # $t2 = Mem[$t1]
• Fill in each of the fields

    Bits:      6        5       5       16
    Decimal:   35       9       10      0
    Binary:    100011   01001   01010   0000000000000000

Direct Memory Addressing
• When loading/storing, sometimes it is necessary to address a full 32 bits
• Many options, including:
    • Use a 32 bit constant already stored in a register
        lw $t1, 0($t0)     # Load using register $t0
    • Load an address constant from a table in memory
        lw $t0, 40($s0)    # Load the 32 bit address
        lw $t1, 0($t0)     # Load contents at address

J-Format
• The jump instruction format
• Uses different opcodes for each instruction
• Used by j and jal instructions
• Uses absolute addressing since long jumps are common
• Uses word addressing as well (target × 4)
• Pseudodirect addressing, where 28 bits come from the target field (26 bits shifted left by 2) and the remaining 4 bits come from the upper bits of the PC

    Bits:     6     26
    Field:    OP    target
    Meaning:        Jump Target Address

J-Format

    Field:    2 (op)    imm26

• Address-to-Jump = Page-Address + 4 × imm26 = (PC31, PC30, PC29, PC28, I25, I24, ..., I1, I0, 0, 0)two
• Page address: the leftmost 4 bits of the Program Counter
• imm26: 26-bit immediate word address
• Shift left by 2 bits to convert the word address to the byte address

Complete Example
Now we can write the complete example for our while loop:

    80000:   0     16    16    8     0     32
    80004:   0     8     8     8     0     32
    80008:   0     8     19    9     0     32
    80012:   35    9     10    0
    80016:   5     10    18    2
    80020:   0     16    17    16    0     32
    80024:   2     20000
    80028:   …

SPIM Code

    PC             MIPS                              Pseudo MIPS
    [0x00400020]   add  $9, $10, $11                 main: add  $t1, $t2, $t3
    [0x00400024]   j    0x00400048 / 4 [exit]              j    exit
    [0x00400028]   addi $9, $10, -50                       addi $t1, $t2, -50
    [0x0040002c]   lw   $8, 5($9)                          lw   $t0, 5($t1)
    [0x00400030]   lw   $8, -5($9)                         lw   $t0, -5($t1)
    [0x00400034]   bne  $8, $9, 4 [exit-PC]                bne  $t0, $t1, exit    #(48-38)=10H=16/4
    [0x00400038]   addi $9, $10, 50                        addi $t1, $t2, 50
    [0x0040003c]   bne  $8, $9, -8 [main-PC]               bne  $t0, $t1, main    #(20-40)=-20H=-32/4
    [0x00400040]   lb   $8, -5($9)                         lb   $t0, -5($t1)
    [0x00400044]   j    0x00400020 / 4 [main]              j    main
    [0x00400048]   add  $9, $10, $11                 exit: add  $t1, $t2, $t3

Addressing Modes
[Figure: the five MIPS addressing modes, each shown with its instruction fields.]
    1. Immediate addressing: op | rs | rt | Immediate. The operand is the immediate field itself.
    2. Register addressing: op | rs | rt | rd | ... | funct. The operand is a register.
    3. Base addressing: op | rs | rt | Address. Register + Address selects a byte, halfword, or word in memory.
    4. PC-relative addressing: op | rs | rt | Address. PC + Address × 4 selects a word in memory.
    5. Pseudodirect addressing: op | Address. The upper PC bits concatenated with Address × 4 select a word in memory.

Addressing Modes 1- Register Addressing
• A register address field is always 5 bits
• jr $31

    Fields:   0    31    0    0    0    8

    • The 5-bit register address selects a register that contains a 32-bit address into memory
• add $3, $8, $9

    Fields:   0    8    9    3    0    32

    • The first 5-bit register address selects the register that contains operand 1
    • The second 5-bit register address selects the register that contains operand 2
    • The third 5-bit register address selects the register that takes the result

Addressing Modes 2- Base&Displacement Addressing
• sw $5, 300($7)

    Fields:   43    7    5    300 (16-bit imm)

• The 5-bit register address 7 selects the base register, which supplies the 32-bit base address
• The 16-bit immediate value (I15 .. I0) is sign-extended to 32 bits (I15 .. I15, I15 .. I0)
• Base address + sign-extended immediate = 32-bit byte address into memory

Addressing Modes 3- Immediate addressing
• Used by the immediate arithmetic-logic instructions (addi, andi, ori, slti, lui)
• Example fields: op = 8 (addi), rs = 7, rt = 5, 16-bit imm = 300
• The 5-bit register address rs selects the register that contains the data (32-bit register contents)
• The 16-bit immediate value (I15 .. I0) is sign- or zero-extended to 32 bits
    • addi and slti use sign-extend
    • All logical instructions use zero-extend
• Register contents combined with the extended immediate give the 32-bit result, which goes to $rt

Addressing Modes 4- PC-relative addressing
• beq and bne use PC-relative addressing
• Fields: op | rs | rt | immediate value (16-bit)
• The immediate (I15 .. I0) is a 16-bit PC-relative word address
• Shift left by 2 and sign-extend to 32 bits: (I15 .. I15, I15 .. I0, 0, 0)
• Program Counter + shifted and sign-extended immediate = 32-bit byte address = target PC

Addressing Modes 5- Pseudo-Direct addressing
• An operand may contain a large part of the address directly
• j and jal have a 26-bit immediate field used as a direct address
• This field is shifted left by 2 bits and then PC-extended
• Fields: op | immediate value (26-bit)
• The immediate (I25 .. I0) is a 26-bit local word address
• Shift left by 2 bits: (I25, .., I0, 0, 0) is a 28-bit local byte address
• PC31 .. PC28 (4 bits) identify the current page
• Concatenate: PC31 .. PC28, I25, .., I0, 0, 0 is the 32-bit byte address the PC jumps to
• The 28-bit byte address is PC-extended to 32 bits

Addressing Modes Summary
• Register addressing: operand is a register (ex. ALU)
• Base/displacement addressing: operand is at the memory location that is the sum of a base register and a constant (ex. load/store)
• Immediate addressing: operand is a constant within the instruction itself (ex. constants)
• PC-relative addressing: address is the sum of the PC and a constant in the instruction (ex. branch)
• Pseudodirect addressing: target address is the concatenation of a field in the instruction and the PC (ex. jump)

Four Design Principles
    1. Simplicity favors regularity
    2. Smaller is faster
    3. Good design demands good compromises
    4. Make the common case fast