CMPE 325 Computer Architecture II

Cem Ergün
Eastern Mediterranean
University
Using Assembly
Using Arrays for Counting

Consider C code that counts how many elements of an array equal a
target value. The parameters int target, int n, and int *list are
available in $a0, $a1, and $a2 respectively.
    int count = 0;
    int i;
    for (i = 0; i < n; i++) {
        if (list[i] == target) count++;
    }
CMPE325 CH #3
Slide #2
Using Arrays Solution

Writing the loop
            li   $t0, 0             # count = 0
            li   $t1, 0             # i = 0
    Loop:   bge  $t1, $a1, Exit     # goto Exit if i >= n
            add  $t2, $t1, $t1      # $t2 = 2 * i
            add  $t2, $t2, $t2      # $t2 = 4 * i
            add  $t3, $t2, $a2      # $t3 = list + 4 * i
            lw   $t4, 0($t3)        # $t4 = list[i]
            bne  $t4, $a0, Next     # goto Next if $t4 != target
            addi $t0, $t0, 1        # count++
    Next:   addi $t1, $t1, 1        # i++
            j    Loop               # Loop again
    Exit:
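The address arithmetic in the loop can be checked with a small Python sketch. It is an illustration only: memory is modeled as a dictionary from byte addresses to words, and the base address 0x10010000 and the array contents are hypothetical values chosen for the example.

```python
def count_matches(memory, list_addr, n, target):
    """Mirror the MIPS counting loop; memory maps byte addresses to words."""
    count = 0                                # $t0
    i = 0                                    # $t1
    while i < n:                             # bge $t1, $a1, Exit
        offset = (i + i) + (i + i)           # two adds give 4 * i
        word = memory[list_addr + offset]    # lw $t4, 0($t3)
        if word == target:                   # bne $t4, $a0, Next
            count += 1                       # addi $t0, $t0, 1
        i += 1                               # addi $t1, $t1, 1
    return count


# Hypothetical array of four words starting at byte address 0x10010000
base = 0x10010000
values = [7, 3, 7, 9]
memory = {base + 4 * k: v for k, v in enumerate(values)}
result = count_matches(memory, base, len(values), 7)
print(result)
```

Each element sits 4 bytes after the previous one, which is why the index is multiplied by 4 before it is added to the base address.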
MIPS Assembler Directives


• SPIM supports a subset of the MIPS assembler directives
• Some of the directives include:
    • .asciiz – Store a null-terminated string in memory
    • .data – Start of data segment
    • .globl – Identify an exported (global) symbol
    • .text – Start of text segment
    • .word – Store words in memory
• See Appendix A for details and examples
Representing Instructions

High-level → Assembly → Machine

    C program (.c)
      → Compiler  → Assembly program (.s)
      → Assembler → Machine object module (.o)
      → Linker    → Executable
      → Loader    → Memory
Assembler

• Expands macros and pseudoinstructions, and converts numeric constants to binary (e.g. 0xFF written in hex)
• Primary purpose is to produce an object file containing
    • Machine language instructions
    • Application data
    • Information for memory organization
Object File

• Includes
    • Object header – describes file organization
    • Text segment – machine code
    • Data segment – static and dynamic data
    • Relocation information – identifies instructions/data that depend on absolute addresses when the program is loaded
    • Symbol table – list of labels that are not defined (e.g. external references)
    • Debugging information – describes the relationship between source code and machine instructions
Linker
• Linker combines multiple object modules
    • Identifies where code/data will be placed in memory
    • Resolves code/data cross references
    • Produces an executable if all references are found
• Steps
    1. Place code and data modules in memory
    2. Determine the addresses of data and instruction labels
    3. Patch both the internal and external references
• Separation between compiler and linker makes standard libraries an efficient solution to maintaining modular code
Loader

• Loader used at run-time
    1. Reads the executable file header for the size of the text/data segments
    2. Creates an address space that is sufficiently large
    3. Copies instructions and data from the executable into memory
    4. Copies parameters to the main program's stack
    5. Initializes machine registers and sets SP
    6. Jumps to the start-up routine
    7. Makes the exit system call when the program is done
Instruction Encoding


• As we have seen, instructions are written in several different ways, depending on what types of information they need
• MIPS architecture has three instruction formats, all 32 bits in length
    • Regularity is simpler and improves performance
• A 6 bit opcode appears at the beginning of each instruction
    • Needed by control logic to be able to decode the instruction type
    • See Appendix A.10 and Page 153 for a list
Machine Language




• All instructions have the same length (32 bits)
• DP3: Good design demands good compromises
    • Same length or same format
• Three different formats
    • R: arithmetic instruction format
    • I: transfer, branch, immediate format
    • J: jump instruction format
• add $t0, $s1, $s2
    • 32 bits in machine language
    • Fields for:
        • Operation (add)
        • Operands ($s1, $s2, $t0)
• A[300] = h + A[300];
    lw  $t0, 1200($t1)     10001101001010000000010010110000
    add $t0, $s2, $t0      00000010010010000100000000100000
    sw  $t0, 1200($t1)     10101101001010000000010010110000
Instruction Formats
          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
    R:    op       rs       rt       rd       shamt    funct
    I:    op       rs       rt       address / immediate (16 bits)
    J:    op       target address (26 bits)

• op: basic operation of the instruction (opcode)
• rs: first source operand register
• rt: second source operand register
• rd: destination operand register
• shamt: shift amount
• funct: selects the specific variant of the opcode (function code)
• address: offset for load/store instructions (+/- 2^15)
• immediate: constants for immediate instructions
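The three layouts can be sketched as bit-packing helpers in Python. This is a minimal illustration, not an assembler: the function names are made up for this example, and no range checking is done on register numbers.

```python
def encode_r(rs, rt, rd, shamt, funct, op=0):
    """R format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """I format: op(6) rs(5) rt(5) immediate(16); negative values wrap to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, target):
    """J format: op(6) target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


add_word = encode_r(rs=17, rt=18, rd=8, shamt=0, funct=32)   # add $t0, $s1, $s2
lw_word = encode_i(op=35, rs=9, rt=8, imm=1200)              # lw $t0, 1200($t1)
print(format(add_word, "032b"))
print(format(lw_word, "032b"))
```

Shifting each field to its position and OR-ing the pieces together is all that encoding requires, because every format is exactly 32 bits wide.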
Example
A[300] = h + A[300];        /* $t1 <= base of array A; $s2 <= h */

Compiler
    lw  $t0, 1200($t1)      # temporary register $t0 gets A[300]
    add $t0, $s2, $t0       # temporary register $t0 gets h + A[300]
    sw  $t0, 1200($t1)      # stores h + A[300] back into A[300]

Assembler (decimal fields)
    lw     35 |  9 |  8 | 1200
    add     0 | 18 |  8 |  8 | 0 | 32
    sw     43 |  9 |  8 | 1200

Assembler (binary fields)
    lw     100011 | 01001 | 01000 | 0000 0100 1011 0000
    add    000000 | 10010 | 01000 | 01000 | 00000 | 100000
    sw     101011 | 01001 | 01000 | 0000 0100 1011 0000
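Going the other way, the fields can be pulled back out of a machine word with shifts and masks. The sketch below is a simplification: it treats opcode 0 as R format, opcodes 2 and 3 (j and jal) as J format, and everything else as I format. The function name `decode` is chosen for this example.

```python
def decode(word):
    """Split a 32-bit MIPS word into its fields (simplified format detection)."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if op == 0:                       # R format
        rd = (word >> 11) & 0x1F
        shamt = (word >> 6) & 0x1F
        funct = word & 0x3F
        return ("R", op, rs, rt, rd, shamt, funct)
    if op in (2, 3):                  # J format: j and jal
        return ("J", op, word & 0x3FFFFFF)
    return ("I", op, rs, rt, word & 0xFFFF)


lw_fields = decode(int("10001101001010000000010010110000", 2))
add_fields = decode(int("00000010010010000100000000100000", 2))
sw_fields = decode(int("10101101001010000000010010110000", 2))
print(lw_fields, add_fields, sw_fields)
```

The opcode must be read first because it decides how the remaining 26 bits are divided.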
R-Format


• Used by ALU instructions
• Uses three registers: one for destination and two for source

    Bits    6      5          5          5          5           6
            OP=0   rs         rt         rd         sa          funct
                   First      Second     Result     Shift       Function
                   source     source     register   amount      code
                   register   register              (Chap 4)

• Function code specifies which operation
R-Format Example

• Consider the add instruction
        add $8, $17, $18
• We can fill in each of the fields

    Bits    6        5          5          5          5           6
            OP=0     17         18         8          0           32
                     First      Second     Result     Shift       Function
                     source     source     register   amount      code
                     register   register              (Chap 4)
            000000   10001      10010      01000      00000       100000
R-Format Limitations


• The R-Format works well for ALU-type operations, but does not work well for some of the other instructions we have seen
• Consider for example the lw instruction, which takes an offset
    • If placed in an R-format, it would have only 5 bits of space for the offset
    • A 5-bit field gives only 32 distinct offsets, which is not all that useful!
• A good design requires good compromises, so a single instruction format is not possible
Immediates (Numerical Constants)




• Small constants are used frequently (50% of operands)
    • A = A + 5;
    • C = C – 1;
• Solutions
    • Put typical constants in memory and load them
    • Create hardwired registers (e.g. $0 or $zero)
• DP4: make the common case fast
• MIPS instructions for constants (I format)
    • addi $t0, $s7, 4        # $t0 = $s7 + 4

            8        23      8       4
            001000   10111   01000   0000 0000 0000 0100
I-Format

• The immediate instruction format
    • Uses different opcodes for each instruction
    • Immediate field is signed (positive/negative)
    • Used for loads and stores as well as immediate instructions (addi, lui, etc.)
    • Also used for branches since the branch destination is PC relative

    Bits    6     5          5          16
            OP    rs         rt         imm
                  First      Second     Immediate
                  source     source
                  register   register
I-Format Example

• Consider the addi instruction
        addi $8, $9, 1        # $t0 = $t1 + 1
• Fill in each of the fields

    Bits    6        5          5          16
            8        9          8          1
                     First      Second     Immediate
                     source     source
                     register   register
            001000   01001      01000      0000000000000001
Another I-Format Example

• Consider the while loop from before
    Loop:   add $t0, $s0, $s0       # $t0 = 2 * i
            add $t0, $t0, $t0       # $t0 = 4 * i
            add $t1, $t0, $s3       # $t1 = &(A[i])
            lw  $t2, 0($t1)         # $t2 = A[i]
            bne $t2, $s2, Exit      # goto Exit if !=
            add $s0, $s0, $s1       # i = i + j
            j   Loop                # goto Loop
    Exit:
• Pretend the first instruction is located at address 80000
I-Format Example
(Incorrect)

• Consider the bne instruction
        bne $t2, $s2, Exit        # goto Exit if $t2 != $s2
• Fill in each of the fields

    Bits    6        5          5          16
            5        10         18         8
                     First      Second     Immediate
                     source     source
                     register   register
            000101   01010      10010      0000000000001000

• This is not the optimum encoding: the immediate holds the byte distance (8) from the next instruction (80020) to Exit (80028)
PC Relative Addressing



• What can we improve about our use of immediate addresses when branching?
• Since instructions are always 32 bits long and addressing is word aligned, every instruction address must be a multiple of 4
• The immediate can therefore count words instead of bytes, so we actually branch to the address PC + 4 + 4 × immediate
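The rule PC + 4 + 4 × immediate can be sketched in Python in both directions. The helper names are made up for this illustration; the example addresses come from the while loop, whose first instruction is assumed to sit at 80000, so bne is at 80016 and Exit is at 80028.

```python
def branch_immediate(branch_addr, target_addr):
    """Word offset stored in a branch, relative to the next instruction."""
    diff = target_addr - (branch_addr + 4)
    if diff % 4 != 0:
        raise ValueError("branch target must be word aligned")
    imm = diff // 4
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("target is out of range for a 16-bit signed immediate")
    return imm


def branch_target(branch_addr, imm):
    """Byte address reached by a taken branch: PC + 4 + 4 * immediate."""
    return branch_addr + 4 + 4 * imm


exit_imm = branch_immediate(80016, 80028)   # bne $t2, $s2, Exit
print(exit_imm, branch_target(80016, exit_imm))
```

Counting words instead of bytes gives the same 16 bits four times the reach.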
PC Relative Addressing
• Branch instructions use PC-relative addressing
• Target is the label of address (B) in instruction memory
• PC contains the address of the next instruction = A (the reference point)
• PC-relative byte address = B – A
• 16-bit signed immediate word address relative to the next instruction = k = (B – A)/4
• Target address = B = PC + 4 × Imm16

    byte addr.    Memory                    PC-relative word addr. (imm16)
    0000
    A−2^17:       lw   $t3, 8($s4)          −2^15     farthest backward branch address
    ...           and  $s0, $s1, $t0        −k        negative imm16
    A−4:          beq  $s0, $s1, B          −1
    A:            addi $s3, $s3, 5          0         reference point is the next instruction
    B:            sub  $t0, $s1, $t1        k         positive imm16 = (B−A)/4
    A+2^17−1:     slti $at, $s1, $t0        2^15−1    farthest forward branch address
    FFFF
I-Format Example
(Corrected)

• Re-consider the bne instruction
        bne $t2, $s2, Exit        # goto Exit if $t2 != $s2
• Use PC-relative addressing for the immediate

    Bits    6        5          5          16
            5        10         18         2
                     First      Second     Immediate
                     source     source
                     register   register
            000101   01010      10010      0000000000000010
Branching Far Away
• If the target is more than 2^15 instructions (2^17 bytes) away, the compiler inverts the condition and inserts an unconditional jump
• Consider the example where L1 is far away
        beq $s0, $s1, L1        # goto L1 if $s0 == $s1
• Can be rewritten as
        bne $s0, $s1, L2        # Inverted
        j   L1                  # Unconditional jump
    L2:
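The rewrite can be sketched as a small Python helper that emits either the original branch or the inverted branch plus jump. This is an illustration under stated assumptions: the function names and the skip label L2 are hypothetical, only beq and bne are handled, and the example addresses are illustrative.

```python
BRANCH_MIN = -(1 << 15)        # farthest backward word offset
BRANCH_MAX = (1 << 15) - 1     # farthest forward word offset
INVERSE = {"beq": "bne", "bne": "beq"}


def needs_far_branch(branch_addr, target_addr):
    """True when the word offset does not fit in the 16-bit signed immediate."""
    imm = (target_addr - (branch_addr + 4)) // 4
    return not (BRANCH_MIN <= imm <= BRANCH_MAX)


def rewrite_branch(op, rs, rt, label, branch_addr, target_addr, skip_label="L2"):
    """Return assembly lines for the branch, inverting it when the target is far."""
    if not needs_far_branch(branch_addr, target_addr):
        return [f"{op} {rs}, {rt}, {label}"]
    return [
        f"{INVERSE[op]} {rs}, {rt}, {skip_label}",   # inverted condition skips the jump
        f"j {label}",                                # unconditional jump reaches far targets
        f"{skip_label}:",
    ]


far = rewrite_branch("beq", "$s0", "$s1", "L1", 0x08000000, 0x08200000)
near = rewrite_branch("beq", "$s0", "$s1", "L1", 0x08000000, 0x08000010)
print(far)
print(near)
```

The jump costs one extra instruction but extends the reach from a 16-bit offset to a 26-bit word address.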
Far Target Address
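Text Segment (252MB), 0x00400000 to 0x10000000

    0x00400000
    (0x07fe0000)        −2^17    lower limit of the branch range
    PC (0x08000000)              beq $s0, $s1, L1
    (0x08020000)        +2^17    upper limit of the branch range
    (0x08200000)  L1:            outside the branch range, so the branch becomes
                                     bne $s0, $s1, L2
                                     j   L1
                                 L2:
    0x10000000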
I-Format Example:
Load/Store

• Consider the lw instruction
        lw $t2, 0($t1)        # $t2 = Mem[$t1]
• Fill in each of the fields

    Bits    6        5          5          16
            35       9          10         0
                     First      Second     Immediate
                     source     source
                     register   register
            100011   01001      01010      0000000000000000
Direct Memory Addressing


• When loading/storing, sometimes it is necessary to address a full 32 bits
• Many options, including:
    • Use a 32 bit constant already stored in a register
            lw $t1, 0($t0)        # Load using register $t0
    • Load an address constant from a table in memory
            lw $t0, 40($s0)       # Load the 32 bit address
            lw $t1, 0($t0)        # Load contents at address
J-Format

• The jump instruction format
    • Uses different opcodes for each instruction
    • Used by j and jal instructions
    • Uses absolute addressing since long jumps are common
    • Uses word addressing as well (target × 4)
    • Pseudodirect addressing: 28 bits come from the target field (26 bits × 4), and the remaining 4 bits come from the upper bits of the PC

    Bits    6     26
            OP    target
                  Jump target address
J-Format
            2 | imm26

• imm26: 26-bit immediate word address
• Shift left by 2 bits to convert the word address to the byte address
• Memory page address: leftmost 4 bits of the Program Counter
• Address-to-Jump = Page-Address + 4 × imm26
                  = (PC31, PC30, PC29, PC28, I25, I24, ..., I1, I0, 0, 0)two
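The concatenation can be sketched in Python with a mask and a shift. The helper names are made up for this illustration, and `pc` stands for whichever PC value the hardware takes the upper four bits from.

```python
def jump_target(pc, imm26):
    """Pseudodirect address: upper 4 bits of pc, then imm26, then two zero bits."""
    return (pc & 0xF0000000) | ((imm26 & 0x3FFFFFF) << 2)


def jump_field(target_addr):
    """26-bit word address stored in a j or jal instruction."""
    return (target_addr >> 2) & 0x3FFFFFF


loop_field = jump_field(80000)            # j Loop, with Loop at byte address 80000
print(loop_field, jump_target(80024, loop_field))
```

Because the top four bits are copied from the PC, a jump can only reach targets inside the same 256 MB region as the jump itself.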
Complete Example

• Now we can write the complete example for our while loop

    80000     0 | 16 | 16 |  8 | 0 | 32
    80004     0 |  8 |  8 |  8 | 0 | 32
    80008     0 |  8 | 19 |  9 | 0 | 32
    80012    35 |  9 | 10 | 0
    80016     5 | 10 | 18 | 2
    80020     0 | 16 | 17 | 16 | 0 | 32
    80024     2 | 20000
    80028    …
SPIM Code
    PC              MIPS                            Pseudo MIPS
                                                    main:
    [0x00400020]    add  $9, $10, $11               add  $t1, $t2, $t3
    [0x00400024]    j    0x00400048 / 4 [exit]      j    exit
    [0x00400028]    addi $9, $10, -50               addi $t1, $t2, -50
    [0x0040002c]    lw   $8, 5($9)                  lw   $t0, 5($t1)
    [0x00400030]    lw   $8, -5($9)                 lw   $t0, -5($t1)
    [0x00400034]    bne  $8, $9, 4 [exit-PC]        bne  $t0, $t1, exit     #(48-38)=10H=16/4
    [0x00400038]    addi $9, $10, 50                addi $t1, $t2, 50
    [0x0040003c]    bne  $8, $9, -8 [main-PC]       bne  $t0, $t1, main     #(20-40)=-20H=-32/4
    [0x00400040]    lb   $8, -5($9)                 lb   $t0, -5($t1)
    [0x00400044]    j    0x00400020 / 4 [main]      j    main
                                                    exit:
    [0x00400048]    add  $9, $10, $11               add  $t1, $t2, $t3
Addressing Modes
1. Immediate addressing
        op | rs | rt | Immediate
2. Register addressing
        op | rs | rt | rd | ... | funct
        The register fields select a register in the register file
3. Base addressing
        op | rs | rt | Address
        Register + Address selects a byte, halfword, or word in memory
4. PC-relative addressing
        op | rs | rt | Address
        PC + Address * 4 selects a word in memory
5. Pseudodirect addressing
        op | Address
        Upper bits of PC concatenated with Address * 4 select a word in memory
Addressing Modes
1- Register Addressing
• A register address field is always 5-bit

    jr $31
            0 | 31 | 0 | 0 | 0 | 8
        • 31: 5-bit register address; the register contains a 32-bit address into memory

    add $3, $8, $9
            0 | 8 | 9 | 3 | 0 | 32
        • 8: 5-bit register address; register contains operand1
        • 9: 5-bit register address; register contains operand2
        • 3: 5-bit register address; register takes the result
Addressing Modes
2- Base & Displacement Addressing

    sw $5, 300($7)
            43 | 7 | 5 | 300
        • 7: 5-bit register address of the base register, which holds the 32-bit base address
        • 300: 16-bit immediate value I15 .. I0, sign-extended to 32 bits (I15 .. I15, I15 .. I0)
        • 32-bit base address + sign-extended immediate = 32-bit byte address into memory
Addressing Modes
3- Immediate addressing
• Immediate arithmetic-logic instructions (addi, andi, ori, slti, lui)

            8 | 7 | 5 | 300        (addi $5, $7, 300)
        • 7: 5-bit register address; the register contains data (32-bit register contents)
        • 300: 16-bit immediate value I15 .. I0, sign- or zero-extended to 32 bits
        • addi and slti use sign-extend (I15 .. I15, I15 .. I0)
        • All logical instructions use zero-extend
        • 32-bit register contents + extended immediate = 32-bit result, which goes to $rt
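The difference between the two extensions can be sketched in Python. The helper names are made up for this illustration, and the `addi` model simply wraps to 32 bits rather than modeling the overflow exception of the real instruction.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed value (copy bit 15 upward)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def zero_extend16(imm16):
    """Interpret a 16-bit field as an unsigned value (upper bits are zero)."""
    return imm16 & 0xFFFF


def addi(rs_value, imm16):
    """Simplified addi: sign-extend, add, wrap to 32 bits (no overflow trap)."""
    return (rs_value + sign_extend16(imm16)) & 0xFFFFFFFF


def ori(rs_value, imm16):
    """ori: zero-extend, then bitwise OR."""
    return (rs_value | zero_extend16(imm16)) & 0xFFFFFFFF


field = 0xFFCE                      # 16-bit pattern used for the constant -50
print(sign_extend16(field), zero_extend16(field))
print(addi(100, field), hex(ori(0, field)))
```

The same 16-bit pattern means −50 to addi but 65486 to ori, which is why the extension rule is part of each instruction's definition.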
Addressing Modes
4- PC-relative addressing
• beq and bne use PC-relative addressing

            ops | rs | rt | immediate value (16-bit)
        • I15 .. I0: 16-bit PC-relative word address
        • Shift left 2 and sign-extend to 32 bits: I15 .. I15, I15 .. I0, 0, 0
        • Program Counter register (32-bit) + shifted, sign-extended immediate = byte address, the target PC
Addressing Modes
5- Pseudo-Direct addressing
• An operand may contain a large part of the address directly
    • J and JAL have a 26-bit immJ field as the direct address
    • This field is shifted left by 2 bits, and then PC-extended

            opc | immediate value (26-bit)
        • I25 .. I0: 26-bit local word address
        • Shift left 2 bits: I25, .. , I0, 0, 0 = imm. value × 2^2 (28-bit local byte address)
        • PC31 .. PC28: 4 bits from the Program Counter register (current page)
        • Concatenate: PC31 .. PC28, I25, .. , I0, 0, 0 = 32-bit byte address for the PC jump
• The 28-bit byte address is PC-extended to 32 bits
Addressing Modes Summary





• Register addressing – operand is a register (e.g. ALU)
• Base/displacement addressing – operand is at the memory location that is the sum of a base register and a constant (e.g. load/store)
• Immediate addressing – operand is a constant within the instruction itself (e.g. constants)
• PC-relative addressing – address is the sum of the PC and a constant in the instruction (e.g. branch)
• Pseudodirect addressing – target address is the concatenation of a field in the instruction and the PC (e.g. jump)
Four Design Principles
1. Simplicity favors regularity
2. Smaller is faster
3. Good design demands good compromises
4. Make the common case fast