
slide n3 handout

Computer Architecture
Chapter 2
Lecture (Class A): Tue(1A-2A), Wed(7A-8A)
Lecture (Class B): Tue(2B-3B), Wed(5B-6B)
Office Hours:
Tue(4A-4B), Wed(8B-9A)
This material is for educational uses only. Slides adapted from D. Patterson and M. Irwin, and some contents are based on the material provided by other
paper/book authors and may be copyrighted by them.
Heads Up
q Last week’s material
 Review: Number representation and combinational
logic circuit
q This week’s material
 Introduction to MIPS assembler, adds/loads/stores
q Next week’s material
 MIPS control flow operations
Review: Execute Cycle
The datapath executes the instructions
as directed by control
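[Figure: the five components of a computer: processor (control and datapath), memory, input, and output, with attached devices and network. Example instruction 000000 00100 00010 00010 00000 100000: contents of Reg #4 ADD contents of Reg #2, results put in Reg #2]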
Memory stores both instructions and data
Review: Processor Organization
q Control needs to have circuitry to
 Decide which is the next instruction
and input it from memory
 Decode the instruction
 Issue signals that control the way
information flows between datapath components
 Control what operations the datapath’s functional units
perform
[Figure: the Fetch, Decode, Exec cycle]
q Datapath needs to have circuitry to
 Execute instructions - functional units (e.g., adder) and
storage locations (e.g., register file)
 Interconnect the functional units so that the instructions can
be executed as required
 Load data from and store data to memory
Assembly Language Instructions
q The language of the machine
 Want an ISA that makes it easy to build the
hardware and the compiler while maximizing
performance and minimizing cost
q Stored program (von Neumann) concept
 Instructions are stored in memory (as is the data)
q Our target: the MIPS ISA
 similar to other ISAs developed since the 1980's
 used by Broadcom, Cisco, NEC, Nintendo, Sony, …
Design goals: maximize performance, minimize cost,
reduce design time (time-to-market), minimize memory
space (embedded systems), minimize power
consumption (mobile systems)
RISC - Reduced Instruction Set Computer
q RISC philosophy
 fixed instruction lengths
 load-store instruction sets
 limited number of addressing modes
 limited number of operations
q MIPS, Sun SPARC, HP PA-RISC, IBM
PowerPC …
q Instruction sets are measured by how well
compilers use them as opposed to how well
assembly language programmers use them
q CISC (C for complex), e.g., Intel x86
MIPS Instructions: Addition
 C code: a = b + c;
 MIPS code: add a, b, c
q add: mnemonic indicates what operation to
perform
q b, c: source operands on which the operation
is performed
q a: destination operand to which the result is
written
MIPS Instructions: Subtraction
q Subtraction is similar to addition, only
mnemonic changes
 C code: a = b - c;
 MIPS code: sub a, b, c
q sub: mnemonic indicates what operation to
perform
q b, c: source operands on which the operation
is performed
q a: destination operand to which the result is
written
Design Principle 1
Simplicity favors regularity
q Consistent instruction format
q Same number of operands (two sources and
one destination)
 easier to encode and handle in hardware
Instructions: More Complex Code
q More complex code is handled by multiple
MIPS instructions
Design Principle 2
Make the common case fast
q MIPS includes only simple, commonly used
instructions
q Hardware to decode and execute the
instruction can be simple, small, and fast
q More complex instructions (that are less
common) can be performed using multiple
simple instructions
Operands: Registers
q A computer needs a physical location from
which to retrieve binary operands
q A computer retrieves operands from:
 Registers / Memory / Constants (also called
immediates)
 Main memory is slow
 Most architectures have a small set of (fast)
registers (MIPS has thirty-two 32-bit registers)
 MIPS is called a 32-bit architecture because it
operates on 32-bit data
- A 64-bit version of MIPS also exists, but we will consider
only the 32-bit version
Design Principle 3
Smaller is faster
q MIPS includes only a small number of registers
q Just as retrieving data from a few books on
your table is faster than sorting through 1000
books, retrieving data from 32 registers is
faster than retrieving it from 1000 registers or a
large memory
MIPS Arithmetic Instruction
q MIPS assembly language arithmetic statement
add $t0, $s1, $s2
sub $t0, $s1, $s2
q Each arithmetic instruction performs only one
operation
q Each arithmetic instruction specifies exactly
three operands
destination ← source1 op source2
 Operand order is fixed (the destination is specified
first)
q The operands are contained in the datapath’s
register file ($t0, $s1, $s2)
Compiling More Complex Statements
q Assuming that
 variable b is stored in register $s1,
 c is stored in $s2,
 d is stored in $s3, and
 the result is to be left in $s0, what is the assembler
equivalent to the C statement
h = (b - c) + d
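One possible answer can be checked with a tiny Python model of the register file. The two-instruction sequence in the comments (sub, then add, with $t0 as a scratch register) is an assumed solution, not one printed on the slide; the `run` helper and the sample values are hypothetical.

```python
# Minimal model: registers live in a dict and values wrap to 32 bits.
MASK = 0xFFFFFFFF

def run(program, regs):
    """Execute a list of (op, dest, src1, src2) register-only instructions."""
    for op, rd, rs, rt in program:
        if op == "add":
            regs[rd] = (regs[rs] + regs[rt]) & MASK
        elif op == "sub":
            regs[rd] = (regs[rs] - regs[rt]) & MASK
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs

# Assumed answer for h = (b - c) + d:
#   sub $t0, $s1, $s2   # $t0 = b - c
#   add $s0, $t0, $s3   # h   = $t0 + d
program = [
    ("sub", "$t0", "$s1", "$s2"),
    ("add", "$s0", "$t0", "$s3"),
]

# Sample values (hypothetical): b = 9, c = 4, d = 7
regs = run(program, {"$s1": 9, "$s2": 4, "$s3": 7})
print(regs["$s0"])  # prints 12, i.e. (9 - 4) + 7
```

Each instruction performs exactly one operation on three register operands, so the parenthesized subexpression needs its own instruction and a temporary register.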
MIPS Register File
q Operands of arithmetic instructions must be from a
limited number of special locations contained in the
datapath’s register file
 Thirty-two 32-bit registers
- Two read ports
- One write port
q Registers are
 Fast
- Smaller is faster & Make the common case fast
 Easy for a compiler to use
- e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order
 Improves code density
- Since registers are named with fewer bits than a memory location
q Register addresses are indicated by using $
Naming Conventions for Registers
Number  Name     Usage
0       $zero    constant 0 (hardware)
1       $at      reserved for assembler
2       $v0      expression evaluation &
3       $v1      function results
4       $a0      arguments
5       $a1
6       $a2
7       $a3
8       $t0      temporary: caller saves
...              (callee can clobber)
15      $t7
16      $s0      callee saves
...              (caller can clobber)
23      $s7
24      $t8      temporary (cont’d)
25      $t9
26      $k0      reserved for OS kernel
27      $k1
28      $gp      pointer to global area
29      $sp      stack pointer
30      $fp      frame pointer
31      $ra      return address (hardware)
Operand: Registers
q Written with a dollar sign ($) before their name
 For example, register 0 is written “$0”, pronounced
“register zero” or “dollar zero”
q Certain registers used for specific purposes:
 $0 always holds the constant value 0
 the saved registers, $s0-$s7, are used to hold
variables
 the temporary registers, $t0-$t9, are used to hold
intermediate values during a larger computation
q For now, we only use the temporary registers
($t0-$t9) and the saved registers ($s0-$s7)
q We will use the other registers in later slides
Example
q How to do the following C statement?
 f = (g + h) - (i + j);
 f, g, h, i, j are in $s0, $s1, $s2, $s3, $s4
 use intermediate temporary registers $t0, $t1
add $t0,$s1,$s2 # t0 = g + h
add $t1,$s3,$s4 # t1 = i + j
sub $s0,$t0,$t1 # f=(g+h)-(i+j)
Registers vs. Memory
q Arithmetic instruction operands must be in
registers
 only thirty-two registers are provided
[Figure: processor (control and datapath), memory, input, output, devices, network]
q Compiler associates variables with registers
What about programs with lots of variables?
Register Allocation
q Compiler tries to keep as many variables in
registers as possible: graph coloring
q Some variables cannot be allocated to registers
 large arrays (too few registers)
 aliased variables (variables accessible through
pointers in C)
 dynamically allocated variables
- heap
- stack
 Compiler may run out of registers ⇒ spilling
Register Allocation Using Graph Coloring
q Example
Register Allocation: Spilling
q Spill/Reload code
 Spill/Reload code is needed when there are not
enough colors (registers) to color the interference
graph
Example: What if only two registers are available?
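The idea of coloring an interference graph and spilling when the colors run out can be sketched in a few lines of Python. This is a simple greedy coloring, not the allocator of any particular compiler; the function `color_graph`, the four variables, and their interference edges are all hypothetical.

```python
def color_graph(interference, num_registers):
    """Greedy coloring sketch: returns (assignment, spilled).

    interference maps each variable to the set of variables that are
    live at the same time (they may not share a register).
    """
    assignment = {}
    spilled = []
    # Visit the most constrained variables first (a common heuristic).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        used = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[var] = free[0]   # lowest-numbered free register
        else:
            spilled.append(var)         # no color left: keep it in memory
    return assignment, spilled

# Hypothetical interference graph (edges are symmetric).
interference = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

three, spilled3 = color_graph(interference, 3)
two, spilled2 = color_graph(interference, 2)
print(three, spilled3)
print(two, spilled2)
```

With three registers every variable receives a color. With only two, the triangle a, b, c cannot be colored, so one of its variables is spilled and needs spill/reload code around its uses.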
Fundamental Data Types
Logical Memory Organization
q Memory is a large, single-dimensional array in which every element has an address.
q A memory address is an index into the array
q "Byte addressing" means that the index points to a byte of
memory.
Physical Memory Organization
Processor – Memory Interconnections
q Memory is a large, single-dimensional array
q An address acts as the index into the memory
array
[Figure: processor and memory connected by read addr/write addr, read data, and write data lines; memory holds ? locations, each 32 bits wide]
Data Transfer: Memory to Register (1/3)
q To transfer a word of data, need to specify two
things:
 Register: specify this by number (0 - 31)
 Memory address: more difficult
- Think of memory as a 1D array
- Address it by supplying a pointer to a memory address
- Offset (in bytes) from this pointer
- The desired memory address is the sum of these two
values, e.g., 8($t0)
- Specifies the memory address pointed to by the value in
$t0, plus 8 bytes (why “bytes”, not “words”?)
- Each address is 32 bits
Data Transfer: Memory to Register (2/3)
q Load Instruction Syntax:
 lw $t0,12($s0)
1 2 3 4
 1) operation name
 2) register that will receive value
 3) numerical offset in bytes
 4) register containing pointer to memory
q Example: lw $t0,12($s0)
 lw (Load Word, so a word (32 bits) is loaded at a
time)
 Take the pointer in $s0, add 12 bytes to it, and then
load the value from the memory pointed to by this
calculated sum into register $t0
Data Transfer: Memory to Register (3/3)
q Load Instruction Syntax:
 lw $t0,12($s0)
q Notes:
 $s0 is called the base register, 12 is called the
offset
 Offset is generally used in accessing elements of
array: base register points to the beginning of the
array
Data Transfer: Register to Memory
q Also want to store value from a register into
memory
q Store instruction syntax is identical to Load
instruction syntax
q Example: sw $t0,12($s0)
 sw (meaning Store Word, so 32 bits or one word
are stored at a time)
 This instruction will take the pointer in $s0, add 12
bytes to it, and then store the value from register
$t0 into the memory address pointed to by the
calculated sum
Memory Operand Example 1
q Compile by hand using registers:
 $s1:g, $s2:h, $s3:base address of A
 g = h + A[8];
q What offset in lw to select an array element
A[8] in a C program?
 4x8 = 32 bytes to select A[8]
 1st transfer from memory to register:
 lw $t0, 32($s3) # $t0 gets A[8]
 Add 32 to $s3 to select A[8], put into $t0
q Next add it to h and place in g
 add $s1,$s2,$t0 # $s1 = h+A[8]
Memory Operand Example 2
q C code:
 A[12] = h + A[8];
- h in $s2, base address of A in $s3
q Compiled MIPS code:
 Index 8 requires offset of 32
 lw $t0, 32($s3) # load word
 add $t0, $s2, $t0
 sw $t0, 48($s3) # store word
Accessing Memory
q MIPS has two basic data transfer instructions for
accessing memory (assume $s3 holds 24ten)
lw $t0, 4($s3) #load word from memory
sw $t0, 8($s3) #store word to memory
q The data transfer instruction must specify
 where in memory to read from (load) or write to (store)
– memory address
 where in the register file to write to (load) or read from
(store) – register destination (source)
q The memory address is formed by summing the
constant portion of the instruction and the
contents of the second register
MIPS Memory Addressing
q The memory address is formed by summing the
constant portion of the instruction and the
contents of the second (base) register
$s3 holds 8
lw $t0, 4($s3) #what? is loaded into $t0
sw $t0, 8($s3) #$t0 is stored where?
Memory
Word Address  Data
24            ...0110
20            ...0101
16            ...1100
12            ...0001
8             ...0010
4             ...1000
0             ...0100
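The two questions on this slide can be worked through with a small Python model of base-plus-offset addressing. The memory contents and the value of $s3 come from the figure; the slide shows only the low four bits of each word, so the upper bits are assumed to be 0 here, and the helper names are hypothetical.

```python
# Memory from the figure, keyed by the byte address of each word.
# Assumption: the elided upper bits ("...") of every word are 0.
memory = {
    0: 0b0100, 4: 0b1000, 8: 0b0010, 12: 0b0001,
    16: 0b1100, 20: 0b0101, 24: 0b0110,
}
regs = {"$s3": 8, "$t0": 0}

def effective_address(offset, base):
    """Address = contents of the base register + constant offset (in bytes)."""
    return regs[base] + offset

def lw(rt, offset, base):
    regs[rt] = memory[effective_address(offset, base)]

def sw(rt, offset, base):
    memory[effective_address(offset, base)] = regs[rt]

load_addr = effective_address(4, "$s3")    # 8 + 4
lw("$t0", 4, "$s3")                        # lw $t0, 4($s3)
store_addr = effective_address(8, "$s3")   # 8 + 8
sw("$t0", 8, "$s3")                        # sw $t0, 8($s3)

print(load_addr, format(regs["$t0"], "04b"))      # prints: 12 0001
print(store_addr, format(memory[16], "04b"))      # prints: 16 0001
```

Under these assumptions the load reads the word at address 12 (...0001) into $t0, and the store overwrites the word at address 16 with that value.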
Compiling with Loads and Stores
q Assuming that
 variable b is stored in $s2 and
 the base address of array A is in $s3,
q What is the MIPS assembly code for the C
statement
A[8] = A[2] - b
Address   Element
...       ...
$s3+12    A[3]
$s3+8     A[2]
$s3+4     A[1]
$s3       A[0]
Compiling with a Variable Array Index
Address   Element
...       ...
$s4+12    A[3]
$s4+8     A[2]
$s4+4     A[1]
$s4       A[0]
q Assuming that
 the base address of array A is in
register $s4 and
 variables b, c, and i are in $s1, $s2,
and $s3, respectively,
q What is the MIPS assembly code
for the C statement
c = A[i] - b
add $t1, $s3, $s3 #array index i is in $s3
add $t1, $t1, $t1 #temp reg $t1 holds 4*i
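The slide shows only the two adds that compute 4*i. A plausible way to finish the sequence is modeled below in Python: the last three instructions are an assumed completion (add the base address, load the word, subtract b), and the base address, array contents, and variable values are hypothetical.

```python
MASK = 0xFFFFFFFF
BASE = 0x1000                 # hypothetical base address of A
A = [10, 20, 30, 40, 50]      # hypothetical array contents

# Byte-addressed memory: element k lives at BASE + 4*k.
memory = {BASE + 4 * k: value for k, value in enumerate(A)}

# b in $s1, c in $s2, i in $s3, base address of A in $s4 (b = 3, i = 2)
regs = {"$s1": 3, "$s2": 0, "$s3": 2, "$s4": BASE}

def add(rd, rs, rt):
    regs[rd] = (regs[rs] + regs[rt]) & MASK

def sub(rd, rs, rt):
    regs[rd] = (regs[rs] - regs[rt]) & MASK

def lw(rt, offset, base):
    regs[rt] = memory[regs[base] + offset]

add("$t1", "$s3", "$s3")   # $t1 = 2*i            (from the slide)
add("$t1", "$t1", "$t1")   # $t1 = 4*i            (from the slide)
add("$t1", "$t1", "$s4")   # $t1 = address of A[i] (assumed)
lw("$t0", 0, "$t1")        # $t0 = A[i]            (assumed)
sub("$s2", "$t0", "$s1")   # c = A[i] - b          (assumed)

print(regs["$s2"])  # prints 27, i.e. A[2] - b = 30 - 3
```

The index must be scaled by 4 because memory is byte addressed and each array element occupies one 4-byte word.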
Registers vs. Memory
q Registers are faster to access than memory
q Operating on memory data requires loads and
stores
 More instructions to be executed
q Compiler must use registers for variables as
much as possible
 Only spill to memory for less frequently used
variables
 Register optimization is important!
Dealing with Constants
q Small constants are used quite frequently
(50% of operands in many common
programs)
e.g.,
A = A + 5;
B = B + 1;
C = C - 18;
q Solutions? Why not?
 Put “typical constants” in memory and load them
 Create hard-wired registers (like $zero) for
constants like 1, 2, 4, 10, …
q How do we make this work?
q How do we make the common case fast?
Constant (or Immediate) Operands
q Include constants inside arithmetic instructions
 Much faster than if they have to be loaded from
memory (they come in from memory with the
instruction itself)
q MIPS immediate instructions
addi $s3, $s3, 4 #$s3 = $s3 + 4
There is no subi instruction. Can you guess why
not?
Immediate Operands
q No subtract immediate instruction in MIPS
 ISA design principle: limit the types of operations
to a minimum
 If an operation can be expressed with a simpler
operation, do not include it
 addi …, -X is equivalent to subi …, X, so no subi is needed
q Example
 C code: f = g – 10
 MIPS code: addi $s0, $s1, -10
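The equivalence between adding a negative immediate and subtracting can be checked with a short Python model. It assumes 32-bit wraparound arithmetic and a 16-bit signed immediate field; the helper names and the sample value of g are hypothetical.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def addi(value, imm):
    """Model addi: add a constant that must fit in 16 signed bits."""
    if not -(1 << 15) <= imm <= (1 << 15) - 1:
        raise ValueError("immediate does not fit in 16 signed bits")
    return (value + imm) & MASK

def sub(a, b):
    """Model sub on two register values."""
    return (a - b) & MASK

g = 25                     # hypothetical value held in $s1
f = addi(g, -10)           # addi $s0, $s1, -10
print(f, f == sub(g, 10))  # prints: 15 True
```

Because the immediate is a two's complement number, a single add-immediate instruction covers both directions, and a separate subtract-immediate would add nothing.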
The Constant Zero
q The number zero (0), appears very often in
code; so we define register zero
q MIPS register 0 ($zero) is the constant 0
 Cannot be overwritten
 This is defined in hardware, so an instruction like
addi $0,$0,5 will not do anything
q Useful for common operations
 E.g., move between registers
 add $t2, $s1, $zero
MIPS Instructions, so far
Category       Instr          Example             Meaning
Arithmetic     add            add $s1, $s2, $s3   $s1 = $s2 + $s3
               subtract       sub $s1, $s2, $s3   $s1 = $s2 - $s3
               add immediate  addi $s1, $s2, 4    $s1 = $s2 + 4
Data transfer  load word      lw $s1, 32($s2)     $s1 = Memory($s2+32)
               store word     sw $s1, 32($s2)     Memory($s2+32) = $s1
Review: MIPS Organization
q Arithmetic instructions – to/from the register file
q Load/store instructions - to/from memory
[Figure: MIPS organization. Processor: a register file of 32 registers ($zero - $ra), 32 bits each, with 5-bit src1 addr, src2 addr, and dst addr inputs, a 32-bit write data input, and 32-bit src1 data and src2 data outputs feeding a 32-bit ALU. Memory: 2^30 words of 32 bits, with 32-bit read/write addr, read data, and write data; word addresses (binary) 0…0000, 0…0100, 0…1000, 0…1100, …, 1…1100; byte addresses 0 to 3 and 4 to 7 within the first two words (big Endian)]
Review: Unsigned Binary Representation
Hex         Binary   Decimal
0x00000000  0…0000   0
0x00000001  0…0001   1
0x00000002  0…0010   2
0x00000003  0…0011   3
0x00000004  0…0100   4
0x00000005  0…0101   5
0x00000006  0…0110   6
0x00000007  0…0111   7
0x00000008  0…1000   8
0x00000009  0…1001   9
…
0xFFFFFFFC  1…1100   2^32 - 4
0xFFFFFFFD  1…1101   2^32 - 3
0xFFFFFFFE  1…1110   2^32 - 2
0xFFFFFFFF  1…1111   2^32 - 1
[Figure: bit weights 2^31 2^30 2^29 ... 2^3 2^2 2^1 2^0 over bit positions 31 30 29 ... 3 2 1 0; the all-ones word 1 1 1 ... 1 1 1 1 equals 1 0 0 0 ... 0 0 0 0 - 1 = 2^32 - 1]
Review: Signed Binary Representation
2’sc binary  decimal
1000         -8    (-2^3)
1001         -7    (-(2^3 - 1))
1010         -6
1011         -5
1100         -4
1101         -3
1110         -2
1111         -1
0000         0
0001         1
0010         2
0011         3
0100         4
0101         5
0110         6
0111         7     (2^3 - 1)
To negate: complement all the bits and add a 1 (0101 becomes 1011; 1010 becomes 0110)
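The negation rule (complement all the bits, then add 1) is easy to check in Python for the 4-bit patterns in the table. The function names and the default width of 4 are choices made for this sketch.

```python
def negate(bits, width=4):
    """Two's complement negation: invert every bit, add 1, keep `width` bits."""
    mask = (1 << width) - 1
    return ((~bits & mask) + 1) & mask

def to_signed(bits, width=4):
    """Interpret a `width`-bit pattern as a two's complement integer."""
    sign_bit = 1 << (width - 1)
    return bits - (1 << width) if bits & sign_bit else bits

print(format(negate(0b0101), "04b"), to_signed(negate(0b0101)))  # prints: 1011 -5
print(format(negate(0b1010), "04b"), to_signed(negate(0b1010)))  # prints: 0110 6
```

One pattern is its own negation besides zero: `negate(0b1000)` gives 1000 again, because -8 has no positive counterpart in four bits.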
Machine Language - Arithmetic Instruction
q Instructions, like registers and words of data,
are also 32 bits long
 Example:
add $t0, $s1, $s2
registers have numbers $t0=$8,$s1=$17,$s2=$18
q Instruction Format:
op      rs     rt     rd     shamt  funct
000000  10001  10010  01000  00000  100000
Can you guess what the field names stand for?
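The field layout above can be reproduced by packing the six fields into one 32-bit word. The sketch below uses the field widths 6, 5, 5, 5, 5, 6 and the register numbers given in the example; `encode_r` is a hypothetical helper, not part of any assembler.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field on the right
    return word

# add $t0, $s1, $s2 with $t0 = $8, $s1 = $17, $s2 = $18
word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(format(word, "032b"))   # prints: 00000010001100100100000000100000
print(f"0x{word:08x}")        # prints: 0x02324020
```

Note that the destination register comes first in the assembly statement but third among the register fields of the machine instruction.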
MIPS Instruction Fields
Field:  op      rs      rt      rd      shamt   funct
Width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits  = 32 bits
q op
q rs
q rt
q rd
q shamt
q funct
Machine Language - Load Instruction
q Consider the load-word and store-word instr’s
 What would the regularity principle have us do?
- But . . . Good design demands compromise
q Introduce a new type of instruction format
 I-type for data transfer instructions (previous format
was R-type for register)
q Example:
lw $t0, 24($s2)
op      rs     rt     16 bit number
23hex   18     8      24
100011  10010  01000  0000000000011000
Where's the compromise?
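An I-format word can be built by shifting each field into place, with the offset stored as a 16-bit two's complement number. The helpers `encode_i` and `decode_imm` below are hypothetical names for this sketch; the field values are the ones from the example.

```python
def encode_i(op, rs, rt, imm):
    """Pack op (6 bits), rs (5), rt (5), and a signed 16-bit immediate."""
    if not 0 <= op < 64 or not 0 <= rs < 32 or not 0 <= rt < 32:
        raise ValueError("op, rs, or rt out of range")
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def decode_imm(word):
    """Recover the signed offset from the low 16 bits (sign extension)."""
    imm = word & 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

# lw $t0, 24($s2): op = 23hex, rs = $s2 = 18, rt = $t0 = 8
word = encode_i(op=0x23, rs=18, rt=8, imm=24)
print(f"0x{word:08x}", decode_imm(word))                 # prints: 0x8e480018 24

# A negative offset uses the same field (hypothetical example: lw $t0, -4($s2))
negative = encode_i(op=0x23, rs=18, rt=8, imm=-4)
print(f"0x{negative:08x}", decode_imm(negative))         # prints: 0x8e48fffc -4
```

In this format the rt field names the register being loaded, so it plays the role of a destination rather than a second source.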
Memory Address Location
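q Example:
lw $t0, 24($s2)
24ten + $s2 = 0x120040ac (with $s2 = 0x12004094)
Note that the offset can be positive or negative
[Figure: memory with word addresses (hex) 0x00000000, 0x00000004, 0x00000008, 0x0000000c, …, 0xffffffff; $s2 holds 0x12004094 and the data is read from 0x120040ac; the value 0x00000002 also appears in the figure]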
Machine Language - Store Instruction
q Example:
sw $t0, 24($s2)
op      rs     rt     16 bit number
43      18     8      24
101011  10010  01000  0000000000011000
q A 16-bit offset means access is limited to
memory locations within a range of +2^13 - 1 to
-2^13 (~8,192) words (+2^15 - 1 to -2^15 (~32,768)
bytes) of the address in the base register $s2
 2’s complement (1 sign bit + 15 magnitude bits)
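The reach of a 16-bit two's complement offset follows directly from the field width. The short Python calculation below derives the byte and word ranges; the constant names are hypothetical, and 4-byte words are assumed as elsewhere in these slides.

```python
OFFSET_BITS = 16   # width of the I-format offset field
WORD_BYTES = 4     # one MIPS word is 32 bits

min_bytes = -(1 << (OFFSET_BITS - 1))        # -2^15
max_bytes = (1 << (OFFSET_BITS - 1)) - 1     # 2^15 - 1

# Whole words that fit on each side of the base address.
min_words = min_bytes // WORD_BYTES          # -2^13
max_words = max_bytes // WORD_BYTES          # 2^13 - 1

print(min_bytes, max_bytes)   # prints: -32768 32767
print(min_words, max_words)   # prints: -8192 8191
```

Data further away than this from the base register requires first computing a new base address in a register.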
Machine Language – Immediate Instructions
q What instruction format is used for the addi ?
addi $s3, $s3, 4 #$s3 = $s3 + 4
q Machine format:
Instruction Format Encoding
q Can reduce the complexity with multiple formats
by keeping them as similar as possible
 First three fields are the same in R-type and I-type
q Each format has a distinct set of values in the
op field
Instr  Frmt  op     rs   rt   rd   shamt  funct  address
add    R     0      reg  reg  reg  0      32ten  NA
sub    R     0      reg  reg  reg  0      34ten  NA
addi   I     8ten   reg  reg  NA   NA     NA     constant
lw     I     35ten  reg  reg  NA   NA     NA     address
sw     I     43ten  reg  reg  NA   NA     NA     address
Assembling Code
q Remember the assembler code we compiled
last lecture for the C statement
A[8] = A[2] - b
lw $t0, 8($s3) #load A[2] into $t0
sub $t0, $t0, $s2 #subtract b from A[2]
sw $t0, 32($s3) #store result in A[8]
q Assemble the MIPS object code for these three
instructions (decimal is fine)
lw
sub
sw
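One way to check a hand-assembled answer is to pack the fields in Python. The sketch below assumes the register numbers from the naming table ($t0 = 8, $s2 = 18, $s3 = 19) and the op and funct values from the encoding table; `r_type` and `i_type` are hypothetical helpers.

```python
REG = {"$t0": 8, "$s2": 18, "$s3": 19}   # register numbers used below

def r_type(rs, rt, rd, funct, op=0, shamt=0):
    """R format: op | rs | rt | rd | shamt | funct."""
    return ((op << 26) | (REG[rs] << 21) | (REG[rt] << 16)
            | (REG[rd] << 11) | (shamt << 6) | funct)

def i_type(op, rs, rt, imm):
    """I format: op | rs | rt | 16-bit two's complement number."""
    return (op << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (imm & 0xFFFF)

program = [
    # decimal fields: 35 19 8 8
    ("lw  $t0, 8($s3)",   i_type(35, "$s3", "$t0", 8)),
    # decimal fields: 0 8 18 8 0 34
    ("sub $t0, $t0, $s2", r_type("$t0", "$s2", "$t0", 34)),
    # decimal fields: 43 19 8 32
    ("sw  $t0, 32($s3)",  i_type(43, "$s3", "$t0", 32)),
]

words = [word for _, word in program]
for text, word in program:
    print(f"{text:<20} 0x{word:08x}")
# prints:
# lw  $t0, 8($s3)      0x8e680008
# sub $t0, $t0, $s2    0x01124022
# sw  $t0, 32($s3)     0xae680020
```

For the load and the store, the base register $s3 goes in the rs field and $t0 goes in the rt field, whether it is being written (lw) or read (sw).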
Review: MIPS Instructions, so far
Category                  Instr          Op Code  Example             Meaning
Arithmetic (R format)     add            0 & 32   add $s1, $s2, $s3   $s1 = $s2 + $s3
                          subtract       0 & 34   sub $s1, $s2, $s3   $s1 = $s2 - $s3
Arithmetic (I format)     add immediate  8        addi $s1, $s2, 4    $s1 = $s2 + 4
Data transfer (I format)  load word      35       lw $s1, 100($s2)    $s1 = Memory($s2+100)
                          store word     43       sw $s1, 100($s2)    Memory($s2+100) = $s1
Review: MIPS R3000 ISA
q Instruction Categories
 Load/Store
 Computational
 Jump and Branch
 Floating Point
- coprocessor
 Memory Management
 Special
q Registers
 R0 - R31
 PC
 HI
 LO
q 3 Instruction Formats:
all 32 bits wide
R format: OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I format: OP | rs | rt | 16 bit number
Third format: OP | 26 bit jump target