Assembly Programming Design and Practice
School of Computer and Information Engineering, 이형근 (hklee@kw.ac.kr)

ARM Instruction Set
⚫ Main features
  ◎ 32-bit ARM instructions
  ◎ Conditionally executed instructions
    ▶ Every instruction can execute conditionally, depending on the flags in CPSR.
  ◎ Load/Store architecture instructions
    ▶ All instructions except load and store operate only on data held in registers.
⚫ Instruction types
  ◎ Data Processing Instructions
  ◎ Data Transfer Instructions
  ◎ Control Flow Instructions

ARM Instruction Set Format

Data Processing Instructions
⚫ Add & Subtract operation
  ◎ Syntax
      opcode{cond}{S} Rd, Rn, <Operand2>
    ▶ opcode
        ADD, ADC : add (with carry)
        SUB, SBC : subtract (with carry)
        RSB, RSC : reverse subtract (with carry)
    ▶ cond : Execution condition code
    ▶ S : Status update suffix
    ▶ Rd : Destination register
    ▶ Rn : Source (Operand1) register
    ▶ <Operand2> : Types of operand2
      ▷ #Immediate
        ❖ 8-bit pattern shifted (rotated) by an even number of bits
      ▷ Rm : Source (Operand2) register
        ❖ Rm{, type #immediate/Rs} : Rm shifted/rotated by #immediate/Rs
          ✓ type
              LSL, LSR : logical shift left/right
              ASL, ASR : arithmetic shift left/right
              ROR : rotate right
              RRX : rotate right extended with carry
          ✓ Rs : Shift amount register

Shift Operations
⚫ The shift is applied to the 2nd operand register before the operation executes
  ◎ LSL: logical shift left
  ◎ LSR: logical shift right
  ◎ ASR: arithmetic shift right
  ◎ ROR: rotate right
  ◎ RRX: rotate right extended by carry

Immediate Second Operand
⚫ 8-bit unsigned integer with a range of 0 ~ 255
  ◎ Rotated right by twice a 4-bit number, formed by appending '0' to the 4-bit field (i.e.
    RORs by 0, 2, 4, …, 30)
⚫ Finally, the valid immediate values are (0 ~ 255) rotated right by 2n bits within a 32-bit word, where 0 ≤ n ≤ 15
  (when the pattern does not wrap around, this is (0 ~ 255) × 2²ᵐ with 0 ≤ m ≤ 12)

Data Processing Instructions
⚫ Add & Subtract operation
  ◎ Examples:
      ADD R0, R1, #7          ; Computes R1+7 and stores the result in R0
      ADD R0, R1, R2          ; Computes R1+R2 and stores the result in R0
      ADD R3, R5, R6, LSL #2  ; Computes R5+(R6 shifted left by 2) and stores the result in R3
      ADDS R4, R7, R8, ASR R9 ; Computes R7+(R8 arithmetically shifted right by the amount in R9),
                              ; stores the result in R4 and sets the flags
⚫ Logical operation
  ◎ Syntax
      opcode{cond}{S} Rd, Rn, <Operand2>
    ▶ opcode
        AND, ORR, EOR : bit-wise logical AND, OR, exclusive-OR
        BIC : bit clear
    ▶ cond : Execution condition code
    ▶ S : Status update suffix
    ▶ Rd : Destination register
    ▶ Rn : Source (Operand1) register
    ▶ <Operand2> : Types of operand2
      ▷ #Immediate
        ❖ 8-bit pattern shifted (rotated) by an even number of bits
      ▷ Rm : Source (Operand2) register
        ❖ Rm{, type #immediate/Rs} : Rm shifted/rotated by #immediate/Rs
          ✓ type
              LSL, LSR : logical shift left/right
              ASL, ASR : arithmetic shift left/right
              ROR : rotate right
              RRX : rotate right extended with carry
          ✓ Rs : Shift amount register
  ◎ Examples:
      AND R0, R3, R4          ; Computes R3∧R4 and stores the result in R0
      ORR R1, R5, #0xFF00     ; Computes R5∨0xFF00 and stores the result in R1
      EOR R2, R6, R7, ROR R8  ; Computes R6⊕(R7 rotated right by R8) and stores the result in R2
      BICS R0, R1, R2         ; Computes R1∧(¬R2), stores the result in R0 and sets the flags
⚫ Compare & Test operation
  ◎ Syntax
      opcode{cond} Rn, <Operand2>
    ▶ opcode
        CMP : compare (Rn − Operand2)
        CMN : compare negative (Rn + Operand2)
        TST : test bits (bit-wise AND)
        TEQ : test equivalence (bit-wise XOR)
    ▶ cond : Execution condition code
    ▶ Rn : Source (Operand1) register
    ▶ <Operand2> : Types of operand2
      ▷ #Immediate
      ▷ Rm : Source (Operand2) register
  ◎ Examples:
      CMP R1, R2 ; Computes R1-R2, sets the flags, and does not store the result
      CMN R3, R4 ; Computes
                 R3+R4, sets the flags, and does not store the result
      TST R5, R6 ; Computes R5∧R6, sets the flags, and does not store the result
      TEQ R7, R8 ; Computes R7⊕R8, sets the flags, and does not store the result
⚫ Move operation
  ◎ Syntax
      opcode{cond}{S} Rd, <Operand2>
      opcode{cond} Rd/PSR, PSR/Rn
    ▶ opcode
        MOV, MVN : move (bit-wise complement)
        MRS : move PSR to register
        MSR : move register to PSR
    ▶ cond : Execution condition code
    ▶ S : Status update suffix
    ▶ Rd : Destination register
    ▶ PSR : CPSR or SPSR register
    ▶ Rn : Source register
  ◎ Examples:
      MOV PC, LR   ; Moves the value of LR (R14) to PC (R15)
      MOV R1, R2   ; Moves the value of R2 to R1
      MVN R3, R4   ; Moves the value of (¬R4) to R3
      MRS R5, CPSR ; Moves the value of CPSR to R5
      MSR SPSR, R7 ; Moves the value of R7 to SPSR
⚫ Multiply operation
  ◎ Syntax
      opcode{cond}{S} Rd/RdLo, Rm/RdHi, Rs/Rm, {Rn}/Rs
    ▶ opcode
        MUL, MLA : multiply (and accumulate)
        UMULL, UMLAL : multiply unsigned long (and accumulate)
        SMULL, SMLAL : multiply signed long (and accumulate)
    ▶ cond : Execution condition code (explained later)
    ▶ S : Status update suffix
    ▶ Rd : Destination register
    ▶ Rm, Rs : Source registers
    ▶ Rn : Accumulate register (its value is added to the product)
    ▶ RdLo, RdHi : Destination registers for the low and high words of the 64-bit result (also the accumulate registers)
  ◎ Examples:
      MUL R10, R2, R5      ; Computes R2×R5 and stores the lower 32 bits of the result in R10
      MLA R11, R2, R5, R1  ; Computes R2×R5+R1 and stores the lower 32 bits of the result in R11
      UMULL R0, R4, R5, R6 ; Computes R5×R6 and stores the 64-bit result in [R4:R0]
      UMLAL R4, R5, R3, R8 ; Computes R3×R8+[R5:R4] and stores the 64-bit result in [R5:R4]

Data Transfer Instructions
⚫ Load & Store operation
  ◎ Syntax
      opcode{type}{cond} Rd/Rn, <address_mode>
    ▶ opcode
        LDR : load (memory → register)
        STR : store (register → memory)
    ▶ type :
        B - unsigned Byte (Zero extension to 32 bits)
        SB - signed Byte (Sign extension to 32
        bits)
        H - unsigned Halfword (Zero extension to 32 bits)
        SH - signed Halfword (Sign extension to 32 bits)
    ▶ cond : Execution condition code
    ▶ Rd/Rn : Destination register/Source register
    ▶ <address_mode>
      ▷ register indirect
      ▷ register indirect with immediate offset
      ▷ register indirect with a (scaled) register offset
  ◎ Example
      LDR R2, [R0]              ; Loads R2 with the word pointed at by R0
      LDR R0, [R1, #20]         ; Loads R0 with the word pointed at by R1+20
      LDR R1, [R1, -R2]         ; Loads R1 with the word pointed at by R1-R2
      LDR R3, [R4, R5, LSL #2]  ; Loads R3 with the word pointed at by R4+(R5 shifted left by 2)
    Auto-indexing:
      LDR R6, [R7, #4]!         ; Loads R6 with the word pointed at by R7+4, then updates
                                ; the pointer by adding 4 to R7 (Pre-indexing)
      LDR R6, [R7], #4          ; Loads R6 with the word pointed at by R7, then updates
                                ; the pointer by adding 4 to R7 (Post-indexing)
  ◎ LDR/STR cannot encode a direct (absolute) address in the instruction.
  ◎ LDR/STR generate an address by performing arithmetic on a base register plus an immediate offset or a (scaled) register offset.
  ◎ Pseudo-instructions (ADR, ADRL, LDR) can be used to calculate an address, to load a 32-bit value, and so on.
    ✓ Pseudo-instructions are translated into the appropriate combination of ARM instructions at assembly time.
⚫ Multiple Load & Store operation
  ◎ Syntax
      opcode{access_mode}{cond} Rn{!}, reglist
        LDM : load (memory pointed to by the base register → multiple registers)
        STM : store (multiple registers → memory pointed to by the base register)
    ▶ <access_mode>
      ▷ IA - Increment address After each transfer (default)
      ▷ IB - Increment address Before each transfer
      ▷ DA - Decrement address After each transfer
      ▷ DB - Decrement address Before each transfer
    ▶ cond : Execution condition code
    ▶ Rn : Base register
    ▶ ! : The final address is written back into Rn.
    ▶ reglist : A list of one or more registers to be loaded or stored (enclosed in braces and separated by commas)
  ◎ Example
      LDM R8, {R0, R2, R9}               ; Loads R0, R2 and R9 with the words pointed to by R8 in IA mode
      LDMDA R0, {R1-R6}
      LDMDA R0, {R1, R2, R3, R4, R5, R6} ; Loads R1 through R6 with the words pointed to by R0 in DA mode
      LDM R0!, {R1-R3}                   ; Loads R1 through R3 with the words pointed to by R0 in IA mode,
                                         ; and writes the final address back to R0
⚫ Swap operation
  ◎ Syntax
      opcode{cond} Rd, Rm, [Rn]
        SWP : Swap a word between a register and memory
        SWPB : Swap a byte between a register and memory
    ▶ cond : Execution condition code
    ▶ Rd : Destination register (receives the value read from memory)
    ▶ Rm : Source register (its value is written to memory)
    ▶ Rn : Base register (holds the memory address)

Control Flow Instructions
⚫ Branch operation
  ◎ Syntax
      opcode{cond} <label>
    ▶ opcode
        B : Branch to label
        BL : Branch to label with Link
    ▶ cond : Execution condition code
    ▶ label : PC-relative address in memory
  ◎ The BL instruction causes a branch to label, and copies the address of the next instruction into LR (R14).
    ▶ A program uses the BL instruction to call a subroutine.
        BL sub1      ; copies the return address (the address of the next instruction) to R14
    ▶ LR (R14) holds the return address.
        MOV R15, R14 ; returns from the subroutine

Conditional Execution

Condition Field {cond}
⚫ All operations can be performed conditionally, testing the condition flags in CPSR.
  ◎ Every instruction contains a condition field which determines whether the CPU will execute it.
  CPSR layout: bits 31-28 = N Z C V, bit 7 = I, bit 6 = F, bit 5 = T, bits 4-0 = Mode
⚫ Condition flags
  ◎ N : set when the result is Negative (MSB of the result is 1)
    ▶ (-2)+(-5) = -7, whose 2's complement representation has MSB=1, so N is set to 1.
  ◎ Z : set when the result is Zero
    ▶ 10+(-10) = 0, so Z is set to 1.
    ▶ (0xAA)∧(0x55) = 0x00, so Z is set to 1.
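
The N and Z rules above can be checked with a small C model. This is only a sketch under stated assumptions, not ARM code: the helper names `add32`, `flag_n`, `flag_z` and `check_nz_examples` are hypothetical, and a 32-bit register is modelled with `uint32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit addition that wraps around like the ALU. */
uint32_t add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* N flag: a copy of bit 31 (the MSB) of the result. */
bool flag_n(uint32_t result)
{
    return (result >> 31) != 0;
}

/* Z flag: set when every bit of the result is zero. */
bool flag_z(uint32_t result)
{
    return result == 0;
}

/* Replays the kind of cases listed on the slide. */
void check_nz_examples(void)
{
    /* (-2) + (-5) = -7: the MSB of the result is 1, so N = 1. */
    assert(flag_n(add32((uint32_t)-2, (uint32_t)-5)));

    /* 10 + (-10) = 0, so Z = 1. */
    assert(flag_z(add32(10u, (uint32_t)-10)));

    /* Two patterns with no set bits in common AND to zero, so Z = 1. */
    assert(flag_z(0xAAu & 0x55u));
}
```

Calling `check_nz_examples()` from a test program runs the three cases; the asserts stay silent when the flags behave as described.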
⚫ Condition flags (continued)
  ◎ C : set when a Carry occurs
    ▶ if the result of an addition is greater than 2³²−1 (unsigned carry out),
    ▶ if a subtraction does not borrow (the unsigned result is non-negative),
      ❖ Ex: 1101−1000 = 1101+(0111+1) = (1)0101 → C=1
    ▶ if the last bit shifted out by a shift operation is 1.
  ◎ V : set when oVerflow occurs
    ▶ if the signed result is greater than 2³¹−1 or less than −2³¹
      ▷ When two negative numbers are added and the result is positive (underflow).
      ▷ When two positive numbers are added and the result is negative (overflow).
      ❖ Ex: (0x7FFFFFFF+1) sets V=1, because 0x7FFFFFFF is the largest 32-bit integer in 2's complement.
         -2,147,483,648 (-0x80000000) ≤ a signed 32-bit integer ≤ +2,147,483,647 (+0x7FFFFFFF)
⚫ 15 different conditions
  ◎ To add two numbers without any condition (always),
      ADD R0, R1, R2   ; R0=R1+R2 (same as ADDAL)
  ◎ To add two numbers only if the zero flag is set,
      ADDEQ R0, R1, R2 ; If the zero flag is set, then R0=R1+R2

    Code  Suffix  Flags                        Meaning (after CMP or SUBS)
    0000  EQ      Z set                        Equal
    0001  NE      Z clear                      Not equal
    0010  CS/HS   C set                        Unsigned higher or same (unsigned >=)
    0011  CC/LO   C clear                      Unsigned lower (unsigned <)
    0100  MI      N set                        Minus / Negative
    0101  PL      N clear                      Plus / Positive or zero
    0110  VS      V set                        Overflow
    0111  VC      V clear                      No overflow
    1000  HI      C set AND Z clear            Unsigned higher (unsigned >)
    1001  LS      C clear OR Z set             Unsigned lower or same (unsigned <=)
    1010  GE      N equals V                   Greater or equal (signed >=)
    1011  LT      N not equal to V             Less than (signed <)
    1100  GT      Z clear AND (N equals V)     Greater than (signed >)
    1101  LE      Z set OR (N not equal to V)  Less than or equal (signed <=)
    1110  AL      (ignored)                    Always

Status Update Field {S}
⚫ By default, data processing instructions do not change the condition flags
  ◎ This differs from CMP, CMN, TEQ and TST, which always update the flags.
  ◎ To cause the condition flags to be updated, append “S”.
  ◎ Example
    ▶ To add two numbers and set the condition flags,
        ADDS R0, R1, R2 ; R0=R1+R2 and set flags

Condition Field Example 1
⚫ Condition of EQ
    When R1 is equal to R2,
        CMP R1, R2

Condition Field Example 2
⚫ Condition of CS/HS, PL and HI
    When R1 is higher than R2,
        CMP R1, R2

Condition Field Example 3
⚫ Condition of CC/LO, MI and LS
    When R1 is lower than R2,
        CMP R1, R2

Condition Field Example 4
⚫ Condition of VS, GT and GE
    When R1−R2 is greater than 2³¹−1 or less than −2³¹,
        CMP R1, R2

Condition Field Example 5
⚫ Condition of VS, LT and LE
    When R1−R2 is greater than 2³¹−1 or less than −2³¹,
        CMP R1, R2

Conditional Branch Example 1
C:
    if (a > b) { x = 5; y = c + d; }
    else x = c - d;

Assembly:
    ; start of if-statement
            ADR r4,a        ; get address for a
            LDR r0,[r4]     ; get value of a
            ADR r4,b        ; get address for b
            LDR r1,[r4]     ; get value of b
            CMP r0,r1       ; compare a and b
            BLE fblock      ; if a <= b, branch to fblock
    ; true block
            MOV r0,#5       ; generate value for x
            ADR r4,x        ; get address for x
            STR r0,[r4]     ; store x
            ADR r4,c        ; get address for c
            LDR r0,[r4]     ; get value of c
            ADR r4,d        ; get address for d
            LDR r1,[r4]     ; get value of d
            ADD r0,r0,r1    ; compute y
            ADR r4,y        ; get address for y
            STR r0,[r4]     ; store y
            B after         ; branch around false block
    ; false block
    fblock  ADR r4,c        ; get address for c
            LDR r0,[r4]     ; get value of c
            ADR r4,d        ; get address for d
            LDR r1,[r4]     ; get value of d
            SUB r0,r0,r1    ; compute c - d
            ADR r4,x        ; get address for x
            STR r0,[r4]     ; store value of x
    ; end of if-statement
    after   ...
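
The condition examples and the `BLE fblock` test in Conditional Branch Example 1 can be cross-checked with a small C model of `CMP`. This is a sketch only: the `Flags` struct and the functions `cmp_flags`, `cond_*`, `ble_taken` and `check_condition_examples` are hypothetical names introduced for illustration, and the flag rules are the ones given in the condition table (Rn − Operand2 on 32-bit values).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Model of CMP Rn, Op2: computes Rn - Op2 and derives N, Z, C, V. */
Flags cmp_flags(uint32_t rn, uint32_t op2)
{
    uint32_t result = (uint32_t)(rn - op2);  /* wraps modulo 2^32 */
    Flags f;
    f.n = (result >> 31) != 0;               /* MSB of the result */
    f.z = (result == 0);
    f.c = (rn >= op2);                       /* C = 1 means no borrow */
    /* Signed overflow: operands differ in sign and the result's sign
       differs from Rn's sign. */
    f.v = (((rn ^ op2) & (rn ^ result)) >> 31) != 0;
    return f;
}

/* Condition tests, written exactly as in the condition table. */
bool cond_eq(Flags f) { return f.z; }
bool cond_hi(Flags f) { return f.c && !f.z; }
bool cond_vs(Flags f) { return f.v; }
bool cond_ge(Flags f) { return f.n == f.v; }
bool cond_lt(Flags f) { return f.n != f.v; }
bool cond_gt(Flags f) { return !f.z && (f.n == f.v); }
bool cond_le(Flags f) { return f.z || (f.n != f.v); }

/* True when "CMP a, b" followed by "BLE fblock" would take the branch. */
bool ble_taken(int32_t a, int32_t b)
{
    return cond_le(cmp_flags((uint32_t)a, (uint32_t)b));
}

void check_condition_examples(void)
{
    assert(cond_eq(cmp_flags(5u, 5u)));               /* R1 == R2 */
    assert(cond_hi(cmp_flags(7u, 3u)));               /* R1 higher than R2 */
    assert(!cond_hi(cmp_flags(3u, 7u)));              /* R1 lower than R2 */
    /* 0x7FFFFFFF - (-1) overflows: V is set, yet GT and GE still hold. */
    assert(cond_vs(cmp_flags(0x7FFFFFFFu, 0xFFFFFFFFu)));
    assert(cond_gt(cmp_flags(0x7FFFFFFFu, 0xFFFFFFFFu)));
    /* The if-statement: the false block runs exactly when a <= b. */
    assert(!ble_taken(7, 3));
    assert(ble_taken(3, 7));
    assert(ble_taken(3, 3));
}
```

Because GE, LT, GT and LE compare N with V instead of reading N alone, the signed comparisons stay correct even when the subtraction overflows, which is the situation in Condition Field Examples 4 and 5.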
Conditional Branch Example 2
    ; start of if-statement
            ADR r4,a        ; get address for a
            LDR r0,[r4]     ; get value of a
            ADR r4,b        ; get address for b
            LDR r1,[r4]     ; get value of b
            CMP r0,r1       ; compare a and b
    ; true block (executed only when a > b)
            MOVGT r0,#5     ; generate value for x
            ADRGT r4,x      ; get address for x
            STRGT r0,[r4]   ; store x
            ADRGT r4,c      ; get address for c
            LDRGT r0,[r4]   ; get value of c
            ADRGT r4,d      ; get address for d
            LDRGT r1,[r4]   ; get value of d
            ADDGT r0,r0,r1  ; compute y
            ADRGT r4,y      ; get address for y
            STRGT r0,[r4]   ; store y
    ; false block (executed only when a <= b)
            ADRLE r4,c      ; get address for c
            LDRLE r0,[r4]   ; get value of c
            ADRLE r4,d      ; get address for d
            LDRLE r1,[r4]   ; get value of d
            SUBLE r0,r0,r1  ; compute c - d
            ADRLE r4,x      ; get address for x
            STRLE r0,[r4]   ; store value of x
    ; end of if-statement
    after   ...

Conditional Execution
⚫ Many short branches can be replaced by conditionally executed instructions.
  ◎ Non-executed instructions soak up 1 cycle.
    ▶ They still have to complete their cycle so that the following instructions can be fetched and decoded.
  ◎ This removes the need for many branches, which stall the pipeline (3 cycles to refill).
  ◎ It allows very dense in-line code, without branches.
  ◎ The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.

Special Purpose Registers
⚫ Stack Pointer (R13)
  ◎ Holds the address of the stack in memory
⚫ Link Register (R14)
  ◎ Holds the return address used when a subroutine or an exception handler completes
      MOV R15, R14 or MOV PC, LR
⚫ Program Counter (R15)
  ◎ Holds the address of the instruction being fetched
    ▶ All instructions are 32-bit aligned.
      ▷ The PC value is stored in bits [31:2], with bits [1:0] equal to zero.
    ▶ A pipeline is used to increase the speed of instruction execution.
      ▷ PC points to the instruction to be fetched.
        Address   Stage     Operation
        PC        FETCH     Instruction fetched from memory
        PC - 4    DECODE    Decoding of instruction
        PC - 8    EXECUTE   Read register(s), shift and ALU operation, write register(s) back

Special Purpose Registers
⚫ CPSR (Current Program Status Register)
  ◎ Holds processor status and control information
    CPSR layout: bits 31-28 = N Z C V, bit 7 = I, bit 6 = F, bit 5 = T, bits 4-0 = Mode
    ▶ N, Z, C, V (condition flags)
    ▶ I (IRQ mask bit): disables IRQ interrupts when set
    ▶ F (FIQ mask bit): disables FIQ interrupts when set
    ▶ T (Thumb mode bit): indicates Thumb state when set
    ▶ M4~M0 (Mode bits): select the processor operating mode

        M[4:0]   Mode
        10000    User
        10001    FIQ
        10010    IRQ
        10011    Supervisor
        10111    Abort
        11011    Undefined
        11111    System

⚫ SPSR (Saved Program Status Register)
  ◎ Stores the value the CPSR held at the moment an exception was taken
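
The CPSR fields listed above can be pulled apart with a few shifts and masks. The following C sketch assumes the bit positions shown on the slide (N Z C V in bits 31-28, I/F/T in bits 7/6/5, mode in bits 4-0); the type `CpsrFields` and the functions `decode_cpsr`, `mode_name` and `check_cpsr_example` are hypothetical names, and the sample value 0x600000D3 is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of a CPSR value. */
typedef struct {
    bool n, z, c, v;   /* condition flags, bits 31..28 */
    bool i, f, t;      /* IRQ mask, FIQ mask, Thumb state: bits 7, 6, 5 */
    uint8_t mode;      /* M[4:0], bits 4..0 */
} CpsrFields;

CpsrFields decode_cpsr(uint32_t cpsr)
{
    CpsrFields s;
    s.n = ((cpsr >> 31) & 1u) != 0;
    s.z = ((cpsr >> 30) & 1u) != 0;
    s.c = ((cpsr >> 29) & 1u) != 0;
    s.v = ((cpsr >> 28) & 1u) != 0;
    s.i = ((cpsr >> 7) & 1u) != 0;
    s.f = ((cpsr >> 6) & 1u) != 0;
    s.t = ((cpsr >> 5) & 1u) != 0;
    s.mode = (uint8_t)(cpsr & 0x1Fu);
    return s;
}

/* Maps M[4:0] to the mode names from the table above. */
const char *mode_name(uint8_t mode)
{
    switch (mode) {
    case 0x10: return "User";        /* 10000 */
    case 0x11: return "FIQ";         /* 10001 */
    case 0x12: return "IRQ";         /* 10010 */
    case 0x13: return "Supervisor";  /* 10011 */
    case 0x17: return "Abort";       /* 10111 */
    case 0x1B: return "Undefined";   /* 11011 */
    case 0x1F: return "System";      /* 11111 */
    default:   return "(not in table)";
    }
}

/* Illustrative value: Z and C set, IRQ and FIQ masked, ARM state,
   mode bits 10011. */
void check_cpsr_example(void)
{
    CpsrFields s = decode_cpsr(0x600000D3u);
    assert(!s.n && s.z && s.c && !s.v);
    assert(s.i && s.f && !s.t);
    assert(s.mode == 0x13);
}
```

On a real processor the value would come from `MRS Rd, CPSR`; here it is passed in as a plain integer so the decoding can be tried on any host.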