Five classic components I am like a control tower CS/COE0447: Computer Organization and Assembly Language Chapter 2 I am like a pack of file folders I exchange information with outside world I am like a conveyor belt + service stations Sangyeun Cho Dept. of Computer Science University of Pittsburgh CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 2 MIPS operations and operands Operation specifies what function to perform by the instruction O Operand d specifies ifi with ith what h t a specific ifi function f ti is i to t be b performed f d by b the instruction MIPS operations p • • • • • • • MIPS arithmetic All arithmetic instructions have 3 operands • Two source registers and one target or destination register Examples • add $s1, $s2, $s3 • sub $s4, $s5, $s6 Registers Fixed registers (e.g., HI/LO) Memory location Immediate value CS/CoE0447: Computer Organization and Assembly Language <op> p <rtarget> <rsource1> <rsource2> • Operand O d order d iin notation t ti iis fi fixed; d ttargett fi firstt Arithmetic (integer/floating-point) Logical Shift Compare Load/store Branch/jump System control and coprocessor MIPS operands • • • • CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 3 # $s1 ⇐ $s2 + $s3 # $s4 ⇐ $s5 – $s6 University of Pittsburgh 4 MIPS registers $zero $at $v0 $v1 $a0 $a1 $a2 $a3 $t0 $t1 $t2 $t3 $t4 $ $t5 $t6 $t7 General-purpose General purpose registers (GPRs) 32 bits 32 bits r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 3 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26 r27 r28 r29 29 r30 r31 32 bits $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7 $t8 $t9 $k0 $k1 $gp $ $sp $fp $ra General-Purpose p Registers g The name GPR implies that all these registers can be used as operands in instructions Still,, conventions and limitations exist to keep p GPRs from being used arbitrarily HI LO • $0, termed $zero, always has a value of “0” • $31, termed $ra (return address), is reserved for storing the return address dd ffor subroutine b call/return ll • Register usage and related software conventions are typically summarized in “application binary interface” (ABI) – important when writing g system y software such as an assembler or a compiler p PC CS/CoE0447: Computer Organization and Assembly Language 32 GPRs in MIPS • Are they sufficient? Special-Purpose p p Registers g CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 5 Special-purpose Special purpose registers 6 Instruction encoding HI/LO registers g are used for storing g result from multiplication p operations Instructions are encoded in binary numbers • Assembler translates assembly programs into binary numbers • Machine (processor) decodes binary numbers to figure out what the original instruction is PC (program counter) • MIPS has a fixed, fixed 32 32-bit bit instruction encoding • Always keeps the pointer to the current program execution point; instruction fetching occurs at the address in PC • Not directly visible and manipulated by programmers in MIPS Other architectures Encoding should be done in a way that decoding is easy MIPS instruction formats • • • • • May not have HI/LO; use GPRs to store the result of multiplication • May allow storing to PC to make a jump CS/CoE0447: Computer Organization and Assembly Language R-format: arithmetic instructions I-format: data transfer/arithmetic/jump instructions J-format: jump instruction format (FI-/FR-format: floating-point instruction format) CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 7 University of Pittsburgh 8 MIPS instruction formats Name Fields Instruction encoding example Comments Field Size 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits All MIPS instructions 32 bits R-format op rs rt rd shamt funct Arithmetic/logic instruction format I-format J-format op rs rt op address/immediate target address Data transfer, branch, immediate format Instruction Format op rs rt rd shamt funct immediate add R 000000 reg reg reg 00000 100000 NA sub R 000000 reg reg reg 00000 100010 NA addi (add immediate) I 00 000 001000 reg reg NA NA NA constant lw (load word) I 100011 reg reg NA NA NA address offset sw (store word) I 101011 reg reg NA NA NA address offset Jump instruction format CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 9 Logic instructions Name R-format Dealing with immediate Fields op rs rt rd shamt 10 funct Comments Name Logic instruction format I-format Bit-wise logic operations <op> <rtarget> <rsource1> <rsource2> E Examples l • nor $s3, $t2, $s1 • xor $s3, $s2, $s1 rs rt 16-bit immediate T n fe branch, Transfer, b n h immediate immedi te format Many operations involve small “immediate” value Some frequently freq entl used sed arithmetic/logic instruction instr ction shave sha e their “immediate” version • • • • # $t3 ⇐ $t2 | $t1 # $s3 ⇐ ~($t2 | $s1) # $s3 ⇐ $s2 ^ $s1 CS/CoE0447: Computer Organization and Assembly Language op Comments • a=a+1 • b=b–4 • c = d | 0x04 • and $s3, $s2, $s1 # $s3 ⇐ $s2 & $s1 • or $t3, $t2, $t1 Fields addi $s3, $s2, 16 andi $s5, $s4, 0xfffe ori $t4, $t4 $t4 $t4, 4 xori $s7, $s6, 16 # $s3 ⇐ $s2 + 16 # $s5 ⇐ $s4 & 1111111111111110b # $t4 ⇐ $t4 | 0000000000000100b # $s7 ⇐ $s6 ^ 0000000000010000b There is no “subi”; why? CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 11 University of Pittsburgh 12 Handling long immediate number Memory transfer instructions Sometimes we need a long g immediate value, e.g., g 32 bits Also called memory access instructions Only two types of instructions • Do we need an instruction to load a 32-bit constant value to a register? MIPS requires q that we use two instructions • Load: move data from memory y to register g e.g., lw $s5, 4($t6) • lui $s3, 1010101001010101b $s3 e.g., sw $s7, 16($t3) 1010101001010101 0000000000000000 Then we fill the low-order 16 bits • ori $s3, $s3, 1100110000110011b $s3 # memory[$t3+16] ⇐ $s7 In MIPS (32-bit architecture) there are memory transfer instructions for • 32-bit word: “int” type in C • 16-bit half-word: “short” type in C • 8-bit byte: “char” type in C 1010101001010101 1100110000110011 CS/CoE0447: Computer Organization and Assembly Language # $s5 ⇐ memory[$t6 + 4] • Store: move data from register to memory CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 13 Address calculation 14 Machine code example Memoryy address is specified p with a (register, g constant) p pair • Register to keep the base address • Constant field to keep the “offset” from the base address void swap(int v[], int k) { int temp; temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; } • Address is,, then,, ((register g + offset)) • The offset can be positive or negative MIPS uses this simple address calculation method; other architectures such as PowerPC and x86 support different methods CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 15 swap: sll add lw lw sw sw jr $t0, $t1, $t3, $t4, $t4 $t4, $t3, $ra $a1, 2 $a0, $t0 0($t1) 4($t1) 0($t1) 4($t1) University of Pittsburgh 16 Memory view 0 BYTE #0 1 BYTE #1 2 BYTE #2 3 BYTE #3 4 BYTE #4 5 BYTE #5 32-bit byte address • 232 bytes b t with ith byte b t addresses dd from f 0 tto 232 – 1 • 230 words with byte addresses 0, 4, 8, …, 232 – 4 Words are aligned • 2 least significant bits (LSBs) of an address are 0s 0 WORD 4 WORD 8 WORD 12 WORD 16 WORD 20 WORD Half words are aligned • LSB of an address is 0 MSB Addressing within a word 0 the last? • Big-endian vs. little-endian 0 • Which byte appears first and which byte … Memoryy is a large, g single-dimension g 8-bit (byte) y arrayy with an address to each 8-bit item (“byte address”) A memory address is an index into the array … Memory organization CS/CoE0447: Computer Organization and Assembly Language 0 2 2 1 MSB 3 CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh LSB 1 3 LSB 0 University of Pittsburgh 17 More on alignment A misaligned access • lw $s4, 3($t0) More on alignment How do we define a word at address? • Data in byte 0, 1, 2, 3 0 0 1 4 4 5 6 7 8 8 9 10 11 If you meant this, use the address 0, not 3 • Data in byte 3, 4, 5, 6 2 int main() { int A[100]; int i; int *ptr; 3 for (i = 0; i < 100; i++) A[i] = i; … 18 printf("address of A[0] = %8x\n", &A[1]); printf("address of A[50] = %8x\n", &A[50]); ptr t = &A[50]; &A[50] If you meant this, it is indeed misaligned! Certain hardware implementation may support this; usually not If you still want to obtain a word starting from the address 3 – get a byte from address 3, a word from address 4 and manipulate the two data to get what you want printf("address in ptr = %8x\n", ptr); printf("value pointed by ptr = %d\n", *ptr); } address of A[0] = bffff2b4 address of A[50] = bffff378 address in ptr = bffff378 value pointed by ptr = 50 address in ptr = bffff379 Segmentation fault ptr = (int*)((unsigned int)ptr + 1); printf("address f( dd in ptr = %8x\n", ptr); ) printf("value pointed by ptr = %d\n", *ptr); Alignment issue does not exist for byte access CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 19 University of Pittsburgh 20 Shift instructions Name R-format Control Fields NOT USED op rt Comments rd shamt funct Instruction that potentially changes the flow of program execution MIPS conditional branch instructions shamt is “shift amount” Bits change their positions inside a word <op> <rtarget> <rsource> <shift_amount> • bne $s4, $s3, LABEL • beq $s4, $s3, LABEL Examples if (i == h) h =i+j; # $s3 ⇐ $s4 << 4 # $s6 ⇐ $s5 >> 6 • sll $s3, $s4, 4 • srl $s6, $s5, 6 Example YES Shift amount can be in a register (in that case “shamt” shamt is not used) Shirt right arithmetic (sra) keeps the sign of a number (i == h)? h=i+j; LABEL: • sra $s7, $ 7 $ $s5, 5 4 CS/CoE0447: Computer Organization and Assembly Language NO bne $s5, b $ 5 $s6, $ 6 LABEL add $s6, $s6, $s4 LABEL: … CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 21 Control Control MIPS unconditional branch instruction (i.e., jump) j p • j LABEL 22 Loops p • “While” loops Example on page 74 Example • “For” loops • f, f g, g and h are in registers $s5 $s5, $s6 $s6, $s7 if (i == h) f=g+h; else f=g–h; YES (i == h)? NO f=g+h; bne $t6, $s7, ELSE add $ $s5,, $s6, $ , $s7 $ j EXIT ELSE: sub $s5, $s6, $s7 EXIT: … f=g–h EXIT CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 23 University of Pittsburgh 24 Control Address in II-format format We have beq q and bne; what about branch-if-less-than? Branch/ Immediate • We have slt if (s1<s2) t0=1; else t0=0; slt $t0, $s1, $s2 op rs rt 16-bit 16 bit immediate Immediate address in an instruction is not a 32-bit value – it’s onlyy 16-bit • The 16-bit immediate value is in signed, 2’s complement form Can you make a “psuedo” instruction “blt $s1, $s2, LABEL”? Addressing in branch instructions • The 16 16-bit bit number in the instruction specifies the number of “instructions” to be skipped • Memory address is obtained by adding this number to the PC • Next address = PC + 4 + sign_extend(16-bit immediate << 2) Assembler needs a temporary register to do this • $at is reserved for this purpose Example • beq $s1, $s2, 100 4 CS/CoE0447: Computer Organization and Assembly Language 17 18 25 CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 25 J format J-format 26 MIPS addressing modes Jump op 26-bit 26 bit immediate Immediate addressing g (I-format) op The address of next instruction is obtained from PC and the immediate value • Next address = {PC[31:28],IMM[25:0],00} • Address boundaries of 256MB Example op rs rt op rs rt 2500 rs rs 27 shamt h f funct immediate rt offset rt offset Pseudo-direct addressing (j) [concatenation w/ PC] CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh rd d PC-relative addressing (beq/bne) [PC + 4 + offset] op CS/CoE0447: Computer Organization and Assembly Language immediate Base addressing (load/store) – [register + offset] op 2 rt Register addressing (R-/I-format) op • j 10000 rs 26-bit 26 bit address University of Pittsburgh 28 MIPS instructions CS/CoE0447: Computer Organization and Assembly Language Register usage convention NAME REG. NUMBER $zero 0 zero $at 1 reserved for assembler usage values for results and expression eval. $v0~$v1 2~3 $a0~$a3 4~7 arguments $t0~$t7 8~15 temporaries $s0~$s7 16~23 temporaries, saved $t8~$t9 24~25 more temporaries $k0~$k1 26~27 reserved for OS kernel $gp 28 global pointer $sp 29 stack pointer $fp 30 frame pointer $ra 31 return address CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh USAGE University of Pittsburgh 29 Stack and frame pointers 30 “C” C program down to numbers Stack pointer ($sp) • Keeps K the th address dd to t th the ttop off the stack • $29 is reserved for this purpose • Stack grows from high address to low • Typical stack operations are push/pop swap: Procedure frame • Contains C i saved d registers i and d llocall variables muli add dd lw lw sw sw jr $2, $5, 4 $2 $4, $2, $4 $2 $15, 0($2) $16, 4($2) $16, 0($2) $15, 4($2) $31 • “Activation record” Frame pointer ($fp) • • • • Points to the first word of a frame Offers a stable reference pointer $30 is reserved for this Some compilers p don’t use $ $fp p CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 31 void swap(int p( v[], [], int k)) { int temp; temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; } 00000000101000010… 00000000000110000… 10001100011000100… 10001100111100100… 10101100111100100… 10101100111100100 10101100011000100… 00000011111000000… University of Pittsburgh 32 Producing a binary Assembler Expands macros • Macro is a sequence of operations conveniently defined by a user • A single macro can expand to many instructions source file .asm/.s assembler object file .obj/.o source file .asm/.s assembler object file .obj/.o linker source file .asm/.s assembler object file .obj/.o library .lib/.a Determines addresses and translates source into binary numbers • • • • executable .exe Start from a pre-defined address Record in “symbol table” addresses of labels Resolve l b branch h targets and d complete l b branch h iinstructions’ i encoding di Record instructions that need be fixed after linkage Packs everything in an object file “Two-pass assembler” • To handle forward references CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 33 Object file 34 Important assembler directives Header • Size Si and d position i i off other h pieces i off the h fil file Assembler directives guide the assembler to properly h dl source code handle d with i h certain i considerations id i .text Text segment Data segment Relocation information .data .align • Machine codes • Tells assembler that a “code” segment follows • Binary representation of the data in the source • Identifies instructions and data words that depend on absolute • Tells assembler that a “data” segment g follows addresses dd Symbol table • Directs aligning following items • Keeps addresses of global labels • Lists unresolved references .global • Contains a concise description of the way in which the program was .asciiz Debugging information • Tells to treat following symbols as global compiled CS/CoE0447: Computer Organization and Assembly Language • Tells to handle following input as a “string” string CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 35 University of Pittsburgh 36 Code example .text .align align .globl Macro example .data 2 main int str: int_str: subu … $sp, $sp, 32 .macro lw … la lw jal … jr $t6, 28($sp) main: loop: .data .align $a0, str $a1, 24($sp) printf .asciiz “%d” .text print_int($arg) la $a0, int_str mov $a1, $arg jal printf .end_macro … print_int($7) $ra la $a0, int_str mov $a1, $7 jal printf 0 str: .asciiz ii “Th sum f “The from 0 … 100 i is %d\ %d\n” ” CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 37 Linker 38 Procedure call Argument g passing p g • First 4 arguments are passed through $a0~$a3 • More arguments are passed through stack Result passing • First 2 results are passed through $v0~$v1 • More results can be passed through stack Stack manipulations can be tricky and error-prone • Application binary interface (ABI) has formal treatment of procedure d calls ll and d stack k manipulations i l i CS/CoE0447: Computer Organization and Assembly Language More will be discussed in labs. CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 39 University of Pittsburgh 40 High-level High level language High-level language Example C, Fortran, Java, … Interpreter vs. compiler Interpreter Compiler Concept Line-by-line translation and execution u o Whole program translation and native a execution u o Example Java, BASIC, … C, Fortran, … Assembly - + High productivity – Short description & readability Portability Low productivity – Long description & low readability Not portable + Interactive Portable (e.g., Java Virtual Machine) High performance – Limited optimization capability in certain cases With p proper p knowledge g and experiences, fully optimized codes can be written – Low performance Binary not portable to other machines or generations CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 41 Compiler structure 42 Compiler front front-end end Compiler p is a veryy complex p software with a varietyy of functions and algorithms implemented inside it Scanning • Takes T k the th input i t source program and d chops h th the programs iinto t recognizable “tokens” • Also known as “lexical analysis” and “LEX” tool is used Front end Front-end • Deals with high-level language constructs and translates them into • “YACC” or “BISON” tools are used Semantic analysis IR generation i Back-end • Takes the abstract syntax trees, performs type checking, and builds a symbol table • Back-end k d IR more or lless correspond d to machine hi iinstructions i • With many compiling steps, IR items becomes machine instructions CS/CoE0447: Computer Organization and Assembly Language • Takes the token stream, checks the syntax, and produces abstract syntax trees more relevant tree-like or list-format internal representation (IR) • Symbols (e (e.g., g a[i+8*j]) are still available Parsing g • Similar to assembly language program, but assumes unlimited registers, etc. CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 43 University of Pittsburgh 44 Compiler back back-end end Code & abstract syntax tree Local optimization • Optimizations within a basic block • Common sub-expression elimination (CSE), copy propagation, while hil ( (save[i] [i] == k) i+=1; constant propagation, dead code elimination, … Global optimization Loop optimizations • Optimizations that deal with multiple basic blocks • Loops are so important that many compiler and architecture optimizations i i i target them h • Induction variable removal, loop invariant removal, strength reduction, … Register g allocation Machine-dependent optimizations • Try to keep as many variables in registers as possible • Utilize any useful instructions provided by the target machine CS/CoE0447: Computer Organization and Assembly Language CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 45 Internal representation CS/CoE0447: Computer Organization and Assembly Language 46 Control flow graph CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 47 University of Pittsburgh 48 Loop optimization CS/CoE0447: Computer Organization and Assembly Language Register allocation CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh University of Pittsburgh 49 Compiler example: gcc gcc = GNU open-source C Compiler cpp: C pre-processor Compiler optimization is not uncommon cc1: C compiler p ((front-end & back-end)) • Not that serious in many cases as compiling is a rare event gas: assembler compared to program execution • Considering the resulting code quality, you pay this fee • Now, computers are fast! • Assembles compiler-generated p g assemblyy codes ld: linker • Links all the object files and library files to generate an executable CS/CoE0447: Computer Organization and Assembly Language Turning on optimizations may incur a significant compile time overhead • For a big, complex software project, compile time is an issue • Compiles your C program Optimizations p are a must • Code quality can be much improved; 2~10 times improvement • Macro expansion p ((#define)) • File expansion (#include) • Handles other directives (#if, #else, #pragma) 50 CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 51 University of Pittsburgh 52 Wrapping up MIPS ISA • Typical RISC style architecture • 32 GPRs • 32-bit instruction format • 32-bit address space Operations & operands Arithmetic, logical, g memoryy access, control instructions Compiler, interpreter, … CS/CoE0447: Computer Organization and Assembly Language University of Pittsburgh 53