ARM Assembly Instructions & Program Analysis

System Programming
Class 3
Basic ARM Instructions (II)
Example Program Analysis
Shift Operations
▪ ARM has no standalone shift instruction
▪ Shifts are embedded in other instructions as a modifier on an operand
– MOV r0, r1, LSL #2 : r0 <- r1 << 2
– MOV r0, r1, LSL r2 : r0 <- r1 << r2
▪ Only the second operand can pass through the barrel shifter
– ADD r0, r1, r2, LSL #3 (O)
– ADD r0, r1, LSL #3, r2 (X)
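To make the operand-2 shift concrete, here is a minimal C sketch of what `ADD r0, r1, r2, LSL #3` computes. The function name `add_lsl` is hypothetical, and the model assumes shift amounts from 0 to 31 and ignores the condition flags.

```c
#include <stdint.h>

/* Models ADD rd, rn, rm, LSL #shift: only the second source operand (rm)
   is shifted before it reaches the ALU. Assumes 0 <= shift <= 31. */
uint32_t add_lsl(uint32_t rn, uint32_t rm, unsigned shift)
{
    uint32_t operand2 = rm << shift;  /* barrel shifter stage */
    return rn + operand2;             /* ALU add, wraps modulo 2^32 */
}
```

For example, `add_lsl(1, 2, 3)` corresponds to r1 = 1, r2 = 2 and yields 1 + (2 << 3).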
Barrel Shifter – Left Shifts
LSL <amount>, where <amount> is a constant or a register
▪ The second operand is shifted left by the specified amount
▪ The last bit shifted out of the 32-bit word is stored in the C flag (for flag-setting instructions such as MOVS)
▪ A left shift can implement multiplication by a constant
r0 <- r1 × 5:
ADD r0, r1, r1, LSL #2
Q: r2 <- r3 × 135 in two instructions?
• Hint: 135 = 9 * 15 = (8+1) * (16-1)
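The shift-and-add idea can be checked in C. The sketch below models the ×5 example and one possible answer to the ×135 exercise that follows the hint; the function names are hypothetical. In ARM terms the two steps would correspond to `ADD r2, r3, r3, LSL #3` followed by `RSB r2, r2, r2, LSL #4`, which is offered as one possible solution rather than the only one.

```c
#include <stdint.h>

/* r0 <- r1 * 5, as in ADD r0, r1, r1, LSL #2 */
uint32_t times5(uint32_t r1)
{
    return r1 + (r1 << 2);            /* x + 4x */
}

/* One possible two-step answer to the exercise, following the hint:
   step 1: t  <- x + (x << 3)  = 9x   (ADD with LSL #3)
   step 2: r2 <- (t << 4) - t  = 15t  (RSB with LSL #4) */
uint32_t times135(uint32_t r3)
{
    uint32_t t = r3 + (r3 << 3);
    return (t << 4) - t;
}
```

Results wrap modulo 2^32, just as the 32-bit registers do.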
Barrel Shifter – Right Shifts
▪ Logical Shift Right:
LSR <amount>
– Inverse of LSL; vacated upper bits are filled with zeros
– Unsigned division by powers of two
▪ Arithmetic Shift Right:
ASR <amount>
– Sign bit is preserved (copied into the vacated upper bits)
– Signed division by powers of two
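The difference between the two right shifts is easiest to see on a negative value. The following C sketch models both, with hypothetical function names and shift amounts assumed to lie between 0 and 31. `asr` is written with unsigned arithmetic because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* LSR: vacated upper bits are filled with zeros. Assumes 0 <= n <= 31. */
uint32_t lsr(uint32_t x, unsigned n)
{
    return x >> n;
}

/* ASR: vacated upper bits are filled with copies of the sign bit.
   Assumes 0 <= n <= 31. */
int32_t asr(int32_t x, unsigned n)
{
    if (x >= 0)
        return (int32_t)((uint32_t)x >> n);
    /* For negative x, ASR equals ~(~x >> n): shift the complement,
       then flip back using the identity ~v == -v - 1. */
    uint32_t comp = (uint32_t)~(uint32_t)x;   /* non-negative, below 2^31 */
    return -(int32_t)(comp >> n) - 1;
}
```

Note that ASR rounds toward negative infinity: `asr(-7, 1)` gives -4, whereas the C expression `-7 / 2` truncates toward zero and gives -3.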
Barrel Shifter – Rotations
▪ Rotate Right:
ROR <amount>
– At each shift, the bit leaving the LSB re-enters at the MSB
– The last bit to leave is copied to the C flag
▪ Rotate Right Extended:
RRX
– Rotates right by one bit, treating the C flag as a 33rd bit
▪ Why is there no rotate left?
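A short C model helps with the rotations and with the closing question. The function names below are hypothetical, and the carry flag is passed explicitly as a pointer because C has no flag register.

```c
#include <stdint.h>

/* ROR: bits leaving the LSB re-enter at the MSB. n is reduced modulo 32. */
uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;
    if (n == 0)
        return x;                         /* avoid the undefined shift by 32 */
    return (x >> n) | (x << (32u - n));
}

/* A left rotation by n equals a right rotation by 32 - n, which is one
   way to see why a separate rotate-left is not needed. */
uint32_t rol(uint32_t x, unsigned n)
{
    return ror(x, (32u - (n & 31u)) & 31u);
}

/* RRX: rotate right by one bit through the carry flag (33-bit rotation).
   *carry holds C on entry and receives the bit that was shifted out. */
uint32_t rrx(uint32_t x, unsigned *carry)
{
    uint32_t result = (x >> 1) | ((uint32_t)(*carry & 1u) << 31);
    *carry = x & 1u;
    return result;
}
```

For instance, rotating 0x12345678 left by 8 is the same as `ror(0x12345678, 24)`.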
ALU with Barrel Shifter
MOV r0, r1, LSL #7 @ (○)
MOV r0, #0x7e @ (○)
MOV r0, #0x27c0 @ (○) = 0x9f, LSL #6
MOV r0, #0x47f0 @ (X) 11 significant bits do not fit in 8
MOV r0, #0x17a0 @ (X) = 0xbd, LSL #5: odd shift amount
MOV r0, #0xffffffff @ (○) = MVN r0, #0x0
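The examples on this slide follow from how a classic ARM (A32) data-processing instruction encodes an immediate: an 8-bit constant rotated right by an even amount. The C sketch below tests that rule; the function name `is_arm_immediate` is hypothetical, and the check covers only this classic encoding, not alternatives such as MOVW on newer cores or the assembler substituting MVN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if value can be written as an 8-bit constant rotated right
   by an even amount (0, 2, ..., 30), the classic A32 immediate form. */
bool is_arm_immediate(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotate left by rot to undo a rotate-right by rot. */
        uint32_t undone = (rot == 0)
            ? value
            : (value << rot) | (value >> (32u - rot));
        if (undone <= 0xFFu)
            return true;
    }
    return false;
}
```

Under this rule 0x27c0 passes (0x9f shifted by an even amount), while 0x17a0 fails because 0xbd would need an odd shift. 0xffffffff fails as a MOV immediate, but its bitwise complement 0 passes, which is why the MVN form works.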
Software Interrupt – SWI
▪ The only instruction that lets user-level software access system resources
▪ User-level software cannot write to system-related resources
– Some system resources are not even readable
▪ SWI hands control over to the system
@ sys_write ( fd, pstr, len )
@ r7=4        r0  r1    r2
mov r0, #1     @ fd <- stdout
adr r1, msg    @ pstr <- msg
mov r2, #14    @ len <- 14
mov r7, #4     @ syscall <- sys_write
swi 0          @ system call
▪ Details will be discussed later
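For comparison, the same request can be made from C. This is a minimal sketch assuming a POSIX system; `write_stdout` is a hypothetical helper, and `write()` is the C library wrapper that loads the arguments and traps into the kernel on the caller's behalf.

```c
#include <string.h>
#include <unistd.h>

/* Mirrors the register setup above: r0 = fd (1, stdout), r1 = pstr,
   r2 = len. Returns the number of bytes written, or -1 on error. */
long write_stdout(const char *pstr)
{
    return (long)write(1, pstr, strlen(pstr));
}
```

Calling `write_stdout("Hello, world!\n")` requests the same 14-byte write as the assembly sequence.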
ARM Assembly language and GNU Assembler
▪ We now have a complete set of instructions for programming in ARM assembly language
– Additional instruction variants enrich expressiveness and improve performance; they add no new functionality
▪ Assembly language vs. assembler
– Assembly language
• A more human-readable representation of machine code
• Defined by the CPU manufacturer
– Assembler
• Tool that translates assembly language into machine code
• Assembler directives specify how data is represented and how sections are organized
• We use the GNU Assembler, which can translate both ARM and x86 assembly language into their respective machine code
GNU Assembler Directives (I)
▪ Labels and comments
loop: instruction @ comment
– Any identifier followed by : is a label; it marks an address for branching, procedure calls, or memory operations
– Anything following @ is a comment
▪ Directives for sections
A program has multiple sections
.text: read-only section for executable code and constants
.data: writable section for global or static variables
.end: end of program; anything after this is ignored
.ltorg: places the literal pool (constants defined in the middle of the program) at this point; not required, but use it when you want precise control over machine code layout
GNU Assembler Directives for External Files
▪ .include "<filename>": same as #include in C
– If the included file contains an .end directive, the assembler stops translating there
▪ .global <symbol>: declares a label to be globally visible
– _start is the special label at which the program starts; declare it with .global _start
▪ .extern <symbol>: uses a globally visible label defined in another file
GNU Assembler Directives for Data (I)
▪ .byte <expression>: reserves and initializes byte-sized memory; more than one expression may be given
.byte 64
.byte 'A', 0b1000010, 0x43, 0104
▪ .hword <expression> or .2byte <expression>
.hword 0xAA55, 12345
.2byte 0x55AA, -1
▪ .word <expression> or .4byte <expression>
▪ .align: inserts zero-initialized bytes so that the following data is word aligned
.byte 64
.align
.word 0xdeadbeef
Q: Why word aligned?
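The padding that .align inserts is a round-up to the next multiple of the word size. A minimal C sketch of that computation follows; the function names are hypothetical and the alignment is assumed to be a power of two.

```c
#include <stddef.h>

/* Rounds offset up to the next multiple of alignment (a power of two). */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Number of padding bytes needed to reach the next aligned offset. */
size_t padding_for(size_t offset, size_t alignment)
{
    return align_up(offset, alignment) - offset;
}
```

After a single `.byte` at offset 0 the next free offset is 1, so `padding_for(1, 4)` bytes of padding place the following `.word` at offset 4. As a hint for the question above: word loads and stores on ARM generally expect addresses that are multiples of 4.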
GNU Assembler Directives for Data (II)
▪ .asciz "<string>": inserts a string literal followed by a NUL character (0x00)
.asciz "Hello, world!\n"
▪ .ascii "<string>": inserts a string literal without a trailing NUL character
.ascii "Hello, world!\n"
▪ .skip <length>: reserves a memory chunk of the given size
.skip 512
▪ .set <symbol>, <expression>: defines an alias
– .equ is equivalent, and = can also be used
– The following three examples are equivalent
.equ adams, (5 * 8) + 2
.set adams, 0x2A
adams = 0b00101010
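Two of the points on this slide can be checked with a few lines of C: the three definitions of adams denote the same value, and .asciz occupies one byte more than .ascii for the same text. The names below are hypothetical and exist only for illustration.

```c
#include <string.h>

/* The three definitions of adams evaluate to the same constant. */
enum {
    ADAMS_EXPR = (5 * 8) + 2,   /* .equ adams, (5 * 8) + 2 */
    ADAMS_HEX  = 0x2A,          /* .set adams, 0x2A        */
    ADAMS_DEC  = 42             /* 0b00101010 in decimal   */
};

/* .asciz stores the characters plus a terminating NUL byte. */
size_t asciz_size(const char *s) { return strlen(s) + 1; }

/* .ascii stores only the characters. */
size_t ascii_size(const char *s) { return strlen(s); }
```

For "Hello, world!\n" the two sizes differ by exactly the one NUL byte.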
Hello, world! printing char-by-char in C
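#include <stdio.h>
char msg[] = "Hello, world!\n";
int main(void) {
    char idx = 0;
    while (idx < 14)             /* 14 characters, NUL excluded */
        putchar(msg[idx++]);     /* print one character, then idx++ */
    return 0;
}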
Hello, world! printing char-by-char in ASM (1)
.global _start
_start:
@ r3 <= msg
@ r4 <= idx
    adr r3, msg        @ r3 = msg
    mov r4, #0         @ r4 = idx = 0
loop:
    cmp r4, #14        @ while (idx < 14) {
    bge exit
Hello, world! printing char-by-char in ASM (2)
@ r3 <= msg
@ r4 <= idx
@ sys_write(fd, ptr, len);
@ r7 = 4    r0  r1   r2
    mov r0, #1         @ fd = stdout
    add r1, r3, r4     @ ptr = &msg[idx]
    mov r2, #1         @ len = 1
    mov r7, #4         @ **can move above loop
    swi 0              @ sys_write(1, &msg[idx], 1)
    add r4, r4, #1     @ idx++;
    b loop             @ }
Hello, world! printing char-by-char in ASM (3)
exit:
@ sys_exit(0)
@ r7 = 1, r0 = exit code
    mov r0, #0         @ exit code = 0
    mov r7, #1
    swi 0              @ sys_exit(0)
msg: .asciz "Hello, World!\n"
ARM Instruction Format
