CS152: Computer Architecture and Engineering

Designing a Single Cycle Datapath
In this lecture, slides from lectures 3, 8 and 9 from the course
Computer Architecture ECE 201 by Professor Mike Schulte are used
with permission.
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
Processor
Input
Control
Memory
Datapath
Output
° Today’s Topic: Design a Single Cycle Processor
The Big Picture: The Performance Perspective
° Performance of a machine is determined by:
• Instruction count
• Clock cycle time
• Clock cycles per instruction (CPI)
° Processor design (datapath and control) will determine:
• Clock cycle time
• Clock cycles per instruction
° Single cycle processor - one clock cycle per instruction
• Advantages: Simple design, low CPI
• Disadvantages: Long cycle time, which is limited by
the slowest instruction.
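The three factors above multiply to give execution time. A minimal sketch of that product follows; the function name and all numbers are illustrative assumptions, not values from the lecture.

```python
def cpu_time(instruction_count, cpi, cycle_time_ns):
    """Execution time in ns = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical comparison: a single cycle design (CPI = 1, long cycle)
# against a design with a shorter cycle but more cycles per instruction.
single_cycle = cpu_time(1_000_000, 1.0, 10.0)
multi_cycle = cpu_time(1_000_000, 4.0, 2.0)
```

With these made-up numbers the low CPI of the single cycle design does not make up for its long cycle time, which is the tradeoff named in the bullets above.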
How to Design a Processor: step-by-step
1. Analyze instruction set => datapath requirements
• the meaning of each instruction is given by register
transfers
R[rd] <– R[rs] + R[rt];
• datapath must include storage element for ISA
registers
• datapath must support each register transfer
2. Select set of datapath components and establish
clocking methodology
3. Design datapath to meet the requirements
4. Analyze implementation of each instruction to determine
setting of control points that effects the register transfer.
5. Design the control logic
MIPS Instruction set
MIPS operands
Name | Example | Comments
32 registers | $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at | Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
2^30 memory words | Memory[0], Memory[4], ..., Memory[4294967292] | Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
MIPS assembly language
Category | Instruction | Example | Meaning | Comments
Arithmetic | add | add $s1, $s2, $s3 | $s1 = $s2 + $s3 | Three operands; data in registers
Arithmetic | subtract | sub $s1, $s2, $s3 | $s1 = $s2 - $s3 | Three operands; data in registers
Arithmetic | add immediate | addi $s1, $s2, 100 | $s1 = $s2 + 100 | Used to add constants
Data transfer | load word | lw $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Word from memory to register
Data transfer | store word | sw $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Word from register to memory
Data transfer | load byte | lb $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Byte from memory to register
Data transfer | store byte | sb $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Byte from register to memory
Data transfer | load upper immediate | lui $s1, 100 | $s1 = 100 * 2^16 | Loads constant in upper 16 bits
Conditional branch | branch on equal | beq $s1, $s2, 25 | if ($s1 == $s2) go to PC + 4 + 100 | Equal test; PC-relative branch
Conditional branch | branch on not equal | bne $s1, $s2, 25 | if ($s1 != $s2) go to PC + 4 + 100 | Not equal test; PC-relative
Conditional branch | set on less than | slt $s1, $s2, $s3 | if ($s2 < $s3) $s1 = 1; else $s1 = 0 | Compare less than; for beq, bne
Conditional branch | set less than immediate | slti $s1, $s2, 100 | if ($s2 < 100) $s1 = 1; else $s1 = 0 | Compare less than constant
Unconditional jump | jump | j 2500 | go to 10000 | Jump to target address
Unconditional jump | jump register | jr $ra | go to $ra | For switch, procedure return
Unconditional jump | jump and link | jal 2500 | $ra = PC + 4; go to 10000 | For procedure call
Review: The MIPS Instruction Formats
° All MIPS instructions are 32 bits long. The three instruction formats are:
• R-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
• I-type: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
• J-type: op [31:26], 6 bits | target address [25:0], 26 bits
° The different fields are:
• op: operation of the instruction
• rs, rt, rd: the source and destination register specifiers
• shamt: shift amount
• funct: selects the variant of the operation in the “op” field
• address / immediate: address offset or immediate value
• target address: target address of the jump instruction
Translating MIPS Assembly into Machine Language
° Humans see instructions as words (assembly language),
but the computer sees them as ones and zeros (machine
language).
° An assembler translates from assembly language to
machine language.
° For example, the MIPS instruction add $t0, $s1, $s2 is
translated as follows (see back of book):
Assembly | Comment
add | op = 0, shamt = 0, funct = 32
$t0 | rd = 8
$s1 | rs = 17
$s2 | rt = 18
Machine language:
op | rs | rt | rd | shamt | funct
000000 | 10001 | 10010 | 01000 | 00000 | 100000
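The translation above amounts to shifting each field into its bit position and OR-ing the results. A minimal sketch, where the helper name encode_r_type is a hypothetical choice and the field values are those from the example:

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: rd = 8 ($t0), rs = 17 ($s1), rt = 18 ($s2)
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = format(word, "032b")  # the word as a 32-character binary string
```

Reading bits in groups of 6, 5, 5, 5, 5, and 6 gives back the op, rs, rt, rd, shamt, and funct fields.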
MIPS Addressing Modes
° Addressing modes specify where the data used by an instruction is
located.
mode | example | action
register direct | add $s1, $s2, $s3 | $s1 = $s2 + $s3
immediate | addi $s1, $s2, 200 | $s1 = $s2 + 200
base+index | lw $s1, 200($s2) | $s1 = mem[200 + $s2]
PC-relative | beq $s1, $s2, 200 | if ($s1 == $s2) PC = PC+4+200*4
Pseudo-direct | j 4000 | PC = (PC[31:28], 4000*4)
° Often, the type of addressing mode depends on the type of operation
being performed (e.g., branches all use PC relative)
° A summary of MIPS addressing modes is given on the back cover of
the book.
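The two PC-changing modes can be sketched as address calculations. The function names are hypothetical. The jump calculation follows the slide's formula, which takes the upper four bits from PC itself.

```python
MASK32 = 0xFFFFFFFF

def branch_target(pc, imm16):
    """PC-relative: PC + 4 + (sign-extended word offset * 4)."""
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    return (pc + 4 + offset * 4) & MASK32

def jump_target(pc, target26):
    """Pseudo-direct: upper 4 bits of the PC joined with target * 4."""
    return (pc & 0xF0000000) | ((target26 << 2) & 0x0FFFFFFF)
```

For example, branch_target(0x1000, 200) gives 0x1000 + 4 + 800, matching the beq row in the table.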
MIPS Addressing Modes/Instruction Formats
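• All MIPS instructions are 32 bits wide - fixed length
• Register (direct): add $s1, $s2, $s3. Fields op | rs | rt | rd; the operand is in a register.
• Immediate: addi $s1, $s2, 200. Fields op | rs | rt | immed; the operand is the immed field.
• Base+index: lw $s1, 200($s2). Fields op | rs | rt | immed; register + immed addresses Memory.
• PC-relative: beq $s1, $s2, 200. Fields op | rs | rt | immed; PC + immed addresses Memory.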
Step 1a: The MIPS Subset for Today
° ADD and SUB
• addu rd, rs, rt
• subu rd, rs, rt
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
° OR Immediate:
• ori rt, rs, imm16
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
° LOAD and STORE
• lw rt, rs, imm16
• sw rt, rs, imm16
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
° BRANCH:
• beq rs, rt, imm16
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
Register Transfer Logic (RTL)
° RTL gives the meaning of the instructions
° All instructions start by fetching the instruction
op | rs | rt | rd | shamt | funct = MEM[ PC ]
op | rs | rt | Imm16 = MEM[ PC ]
inst     Register Transfers
addu     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
subu     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
ori      R[rt] <– R[rs] OR zero_ext(imm16); PC <– PC + 4
load     R[rt] <– MEM[ R[rs] + sign_ext(imm16) ]; PC <– PC + 4
store    MEM[ R[rs] + sign_ext(imm16) ] <– R[rt]; PC <– PC + 4
beq      if ( R[rs] == R[rt] ) then PC <– PC + 4 + (sign_ext(imm16) || 00)
         else PC <– PC + 4
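The register transfers for this subset can be sketched as a small interpreter. This is a sketch under stated assumptions: instructions arrive pre-decoded as dicts (a hypothetical form chosen to keep it short), memory is a dict keyed by byte address, and ori is treated as a bitwise OR.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(inst, R, MEM, pc):
    """Apply one instruction's register transfer; return the next PC."""
    name = inst["name"]
    rs, rt = inst.get("rs", 0), inst.get("rt", 0)
    next_pc = pc + 4
    if name == "addu":
        R[inst["rd"]] = (R[rs] + R[rt]) & MASK32
    elif name == "subu":
        R[inst["rd"]] = (R[rs] - R[rt]) & MASK32
    elif name == "ori":
        R[rt] = R[rs] | inst["imm16"]  # zero extension: imm16 used as-is
    elif name == "lw":
        R[rt] = MEM.get((R[rs] + sign_ext(inst["imm16"])) & MASK32, 0)
    elif name == "sw":
        MEM[(R[rs] + sign_ext(inst["imm16"])) & MASK32] = R[rt]
    elif name == "beq":
        if R[rs] == R[rt]:
            next_pc = pc + 4 + (sign_ext(inst["imm16"]) << 2)
    R[0] = 0  # $zero always reads as 0
    return next_pc & MASK32

# Usage: R[3] <- R[1] + R[2], then PC <- PC + 4
R = [0] * 32
MEM = {}
R[1], R[2] = 5, 7
pc = step({"name": "addu", "rd": 3, "rs": 1, "rt": 2}, R, MEM, 0)
```

Every branch of the if chain corresponds to one row of the register transfer table, and only beq can override the default PC + 4.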
Step 1: Requirements of the Instruction Set
° Memory
• instruction & data
° Registers (32 x 32)
• read rs
• read rt
• write rt or rd
° PC
° Extender (sign extend or zero extend)
° Add and sub register or extended immediate
° Add 4 or shifted extended immediate to PC
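The extender requirement can be sketched as one function with a mode input. The name extend is hypothetical. The encoding 0 = zero extend, 1 = sign extend is taken from the ExtOp values used for ori and lw later in the lecture.

```python
def extend(imm16, ext_op):
    """ext_op = 0: zero extend; ext_op = 1: sign extend (to 32 bits)."""
    imm16 &= 0xFFFF
    if ext_op == 1 and imm16 & 0x8000:
        return imm16 | 0xFFFF0000  # copy the sign bit into bits 31:16
    return imm16
```

The two modes differ only when bit 15 of the immediate is 1.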
Step 2: Components of the Datapath
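° Adder: 32-bit inputs A and B plus CarryIn; outputs a 32-bit Sum and Carry
° MUX: Select chooses one of the 32-bit inputs A and B as the 32-bit output Y
° ALU: 32-bit inputs A and B and a 3-bit OP input; outputs a 32-bit Result
Combinational Logic: Does not use a clock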
Storage Element: Register (Basic Building Blocks)
° Register
• Similar to the D Flip Flop except
- N-bit input and output (Data In, Data Out)
- Write enable input
• Write Enable:
- negated (0): Data Out will not change
- asserted (1): Data Out will become Data In on the falling edge of the clock (Clk)
Clocking Methodology - Negative Edge Triggered
[Timing diagram: Clk, with Setup and Hold intervals marked around each falling edge; values outside those intervals are “Don’t Care”.]
° All storage elements are clocked by the same clock edge
° Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
Storage Element: Register File
° Register File consists of 32 registers:
• Two 32-bit output busses: busA and busB
• One 32-bit input bus: busW
[Figure: block of 32 32-bit Registers with 5-bit inputs RW, RA, and RB, Write Enable, Clk, 32-bit input busW, and 32-bit outputs busA and busB.]
° Register is selected by:
• RA (number) selects the register to put on busA (data)
• RB (number) selects the register to put on busB (data)
• RW (number) selects the register to be written
via busW (data) when Write Enable is 1
° Clock input (CLK)
• The CLK input is a factor ONLY during write operation
• During read operation, behaves as a combinational logic
block:
- RA or RB valid => busA or busB valid after “access
time.”
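The read and write behavior described above can be sketched as a small class. The class and method names are hypothetical. The active clock edge is modeled as an explicit method call.

```python
class RegisterFile:
    """32 x 32-bit register file: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Reads are combinational: no clock is involved.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, write_enable, rw, bus_w):
        # Called once per active clock edge; writes only when enabled.
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF
```

A call such as rf.clock_edge(1, 5, 42) followed by rf.read(5, 0) returns the new value on busA, while a clock edge with write_enable = 0 leaves every register unchanged.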
Register File - Read
° Built using D flip-flops
[Figure: Register file symbol with inputs Read register number 1, Read register number 2, Write register, Write data, and Write, and outputs Read data 1 and Read data 2. Inside, each read register number is the select input of a multiplexor over Register 0 ... Register n – 1; the two multiplexors produce Read data 1 and Read data 2.]
Register File - write
[Figure: the Register number feeds an n-to-2^n decoder; the decoder outputs, together with Write, drive the clock input C of Register 0 ... Register n – 1; Register data drives the D input of every register.]
Storage Element: Idealized Memory
° Memory (idealized)
• One input bus: Data In
• One output bus: Data Out
[Figure: memory block with 32-bit Address, 32-bit Data In, 32-bit DataOut, Write Enable, and Clk.]
° Memory word is selected by:
• Address selects the word to put on Data Out
• Write Enable = 1: address selects the memory word to be written via the Data In bus
° Clock input (CLK)
• The CLK input is a factor ONLY during write operation
• During read operation, memory behaves as a combinational logic block:
- Address valid => Data Out valid after “access time.”
Step 3
° Register Transfer Requirements –> Datapath Design
• Instruction Fetch
• Decode instructions and Read Operands
• Execute Operation
• Write back the result
3a: Overview of the Instruction Fetch Unit
° The common RTL operations
• Fetch the Instruction: mem[PC]
• Update the program counter:
- Sequential Code: PC <- PC + 4
- Branch and Jump: PC <- “something else”
[Figure: PC, clocked by Clk, supplies Address to Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic computes the new PC.]
3b: Add & Subtract
° R[rd] <- R[rs] op R[rt] Example: addu rd, rs, rt
• Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields
• ALUctr and RegWr: control logic after decoding the instruction
[Figure: R-type format (op | rs | rt | rd | shamt | funct). Register file of 32 32-bit Registers with Rw = Rd, Ra = Rs, Rb = Rt (5 bits each), RegWr, and Clk; 32-bit busA and busB feed the ALU, which is controlled by the 3-bit ALUctr; the 32-bit Result returns on busW.]
3c: Logical Operations with Immediate
° R[rt] <- R[rs] op ZeroExt[imm16] Example: ori rt, rs, imm16
[Figure: I-type format (op | rs | rt | immediate); there is no rd field, so the destination is rt. ZeroExt fills bits 31:16 with 0000000000000000 above the 16-bit immediate. A Mux controlled by RegDst selects Rd or Rt as Rw; a Mux controlled by ALUSrc selects busB or ZeroExt(imm16) as the second ALU input; the ALU, controlled by the 3-bit ALUctr, puts the 32-bit Result on busW. RegWr and Clk control the register write.]
3d: Load Operations
° R[rt] <- Mem[R[rs] + SignExt[imm16]] Example: lw rt, rs, imm16
[Figure: I-type format (op | rs | rt | immediate). The datapath from ori, with an Extender controlled by ExtOp in place of ZeroExt. The ALU output is the address (Adr) of the Data Memory, which has WrEn (MemWr), Data In, and Clk; a Mux controlled by W_Src selects the ALU output or the Data Memory output for busW. Other controls: RegDst, RegWr, ALUSrc, ALUctr.]
3e: Store Operations
° Mem[ R[rs] + SignExt[imm16] ] <- R[rt] Example: sw rt, rs, imm16
[Figure: I-type format (op | rs | rt | immediate). Same datapath as for load, with busB also connected to the Data In input of the Data Memory; the ALU output is the address (Adr) and MemWr drives WrEn. Other controls: RegDst, RegWr, ExtOp, ALUSrc, ALUctr, W_Src.]
3f: The Branch Instruction
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | immediate [15:0], 16 bits
° beq rs, rt, imm16
• mem[PC] Fetch the instruction from memory
• Equal <- R[rs] == R[rt] Calculate the branch condition
• if (Equal) Calculate the next instruction’s address
- PC <- PC + 4 + ( SignExt(imm16) x 4 )
else
- PC <- PC + 4
Datapath for Branch Operations
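° beq rs, rt, imm16 Datapath generates condition (equal)
[Figure: I-type format (op | rs | rt | immediate). Fetch side: PC, clocked by Clk, gives the Inst Address; one Adder computes PC + 4, a second Adder adds PC Ext(imm16) with 00 appended; a Mux controlled by nPC_sel and Cond selects the next PC. Register side: Ra = Rs and Rb = Rt read busA and busB (32 bits each) from the 32 32-bit Registers, and an “Equal?” comparator produces Cond.]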
Putting it All Together: A Single Cycle Datapath
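[Figure: the complete single cycle datapath. Inst Memory (Adr from PC) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Next-PC logic: Adder (PC + 4), Adder with PC Ext(imm16) and 00, Mux controlled by nPC_sel. A Mux controlled by RegDst selects Rd (1) or Rt (0) as Rw; 32 32-bit Registers with RegWr and Clk drive busA and busB; the Extender (ExtOp) and a Mux (ALUSrc) feed the ALU (ALUctr), which outputs Equal; Data Memory (WrEn = MemWr, Adr, Data In, Clk); a Mux controlled by MemtoReg drives busW.]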
Step 4: Given Datapath: RTL -> Control
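[Figure: Inst Memory (Adr) outputs Instruction<31:0>, split into Op <26:31>, Fun <0:5>, Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Control takes Op and Fun and drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg into the DATA PATH, which returns Equal.]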
A Single Cycle Datapath
° We have everything except control signals (underlined)
• Today’s lecture will look at how to generate the control signals
[Figure: the single cycle datapath with the Instruction Fetch Unit (nPC_sel, Clk) producing Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Control signals shown: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. The ALU outputs Zero to the Instruction Fetch Unit.]
Meaning of the Control Signals
° MemWr: write memory
° ExtOp: “zero”, “sign”
° ALUSrc: 0 => regB; 1 => immed
° RegDst: 0 => “rt”; 1 => “rd”
° ALUctr: “add”, “sub”, “or”
° RegWr: write dest register
° MemtoReg: 0 => ALU; 1 => Mem
[Figure: the single cycle datapath annotated with these signals, plus Equal from the ALU.]
RTL: The Add Instruction
Format: op [31:26], 6 bits | rs [25:21], 5 bits | rt [20:16], 5 bits | rd [15:11], 5 bits | shamt [10:6], 5 bits | funct [5:0], 6 bits
° add rd, rs, rt
• mem[PC] Fetch the instruction from memory
• R[rd] <- R[rs] + R[rt] The actual operation
• PC <- PC + 4 Calculate the next instruction’s address
The Single Cycle Datapath during Add/Sub
Format: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
° R[rd] <- R[rs] op R[rt]
Control settings: nPC_sel = +4, RegDst = 1, RegWr = 1, ExtOp = x, ALUSrc = 0, ALUctr = Add, MemWr = 0, MemtoReg = 0
[Figure: the single cycle datapath with these settings; the active path runs from the registers selected by Rs and Rt through the ALU and back to register Rd on busW.]
Instruction Fetch Unit at the End of Add
° PC <- PC + 4
• This is the same for all instructions except Branch and Jump
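[Figure: Instruction Fetch Unit with nPC_sel = +4. PC (Clk) drives Adr of Inst Memory, which outputs Instruction<31:0>; one Adder computes PC + 4, a second Adder adds imm16 with 00 appended; the Mux selects PC + 4.]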
The Single Cycle Datapath during Load
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
° R[rt] <- Data Memory {R[rs] + SignExt[imm16]}
Control settings: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 0, MemtoReg = 1
[Figure: the single cycle datapath with these settings; the active path runs from register Rs and the extended imm16 through the ALU to the Data Memory address, and from the Data Memory output to register Rt on busW.]
The Single Cycle Datapath during Store
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
° Data Memory {R[rs] + SignExt[imm16]} <- R[rt]
Control settings: nPC_sel = +4, RegDst = x, RegWr = 0, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 1, MemtoReg = x
[Figure: the single cycle datapath with these settings; the active path runs from register Rs and the extended imm16 through the ALU to the Data Memory address, with busB (register Rt) on Data In.]
The Single Cycle Datapath during Branch
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
° if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0
Control settings: nPC_sel = “Br”, RegDst = x, RegWr = 0, ExtOp = x, ALUSrc = 0, ALUctr = Subtract, MemWr = 0, MemtoReg = x
[Figure: the single cycle datapath with these settings; the ALU subtracts busB from busA and sends Zero to the Instruction Fetch Unit.]
Instruction Fetch Unit at the End of Branch
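Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
° if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
[Figure: Instruction Fetch Unit. PC (Clk) drives Adr of Inst Memory, which outputs Instruction<31:0>; one Adder computes PC + 4, a second Adder adds imm16 with 00 appended; the Mux is controlled by nPC_sel.]
See the book for what the datapath and control look like for jump instructions.
Compared to the book, our processor also supports the ORI instruction.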
A Summary of the Control Signals
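func and op encodings: see Appendix A. The func field is a don’t care for all instructions other than add and sub.
 | add | sub | ori | lw | sw | beq | jump
func | 10 0000 | 10 0010 | We Don’t Care :-)
op | 00 0000 | 00 0000 | 00 1101 | 10 0011 | 10 1011 | 00 0100 | 00 0010
RegDst | 1 | 1 | 0 | 0 | x | x | x
ALUSrc | 0 | 0 | 1 | 1 | 1 | 0 | x
MemtoReg | 0 | 0 | 0 | 1 | x | x | x
RegWrite | 1 | 1 | 1 | 1 | 0 | 0 | 0
MemWrite | 0 | 0 | 0 | 0 | 1 | 0 | 0
nPCsel | 0 | 0 | 0 | 0 | 0 | 1 | 0
Jump | 0 | 0 | 0 | 0 | 0 | 0 | 1
ExtOp | x | x | 0 | 1 | 1 | x | x
ALUctr<2:0> | Add | Subtract | Or | Add | Add | Subtract | xxx
Instruction formats:
R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0] (add, sub)
I-type: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0] (ori, lw, sw, beq)
J-type: op [31:26] | target address [25:0] (jump)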
Step 5: The Concept of Local Decoding
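op | 00 0000 | 00 1101 | 10 0011 | 10 1011 | 00 0100 | 00 0010
 | R-type | ori | lw | sw | beq | jump
RegDst | 1 | 0 | 0 | x | x | x
ALUSrc | 0 | 1 | 1 | 1 | 0 | x
MemtoReg | 0 | 0 | 1 | x | x | x
RegWrite | 1 | 1 | 1 | 0 | 0 | 0
MemWrite | 0 | 0 | 0 | 1 | 0 | 0
Branch | 0 | 0 | 0 | 0 | 1 | 0
Jump | 0 | 0 | 0 | 0 | 0 | 1
ExtOp | x | 0 | 1 | 1 | x | x
ALUop<N:0> | “R-type” | Or | Add | Add | Subtract | xxx
[Figure: op (6 bits) feeds Main Control, which outputs ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and outputs the 3-bit ALUctr to the ALU.]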
The Encoding of ALUop
[Figure: op (6 bits) feeds Main Control, which outputs ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and outputs the 3-bit ALUctr.]
° In this exercise, ALUop has to be N=2 bits wide to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform:
- (2) Or, (3) Add, and (4) Subtract
° To implement the full MIPS ISA, ALUop has to be 3 bits to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform:
- (2) Or, (3) Add, (4) Subtract, (5) And, and (6) Set on <
 | R-type | ori | lw | sw | beq | jump
ALUop (Symbolic) | “R-type” | Or | Add | Add | Subtract | xxx
ALUop<2:0> | 1 00 | 0 10 | 0 00 | 0 00 | 0 01 | xxx
The Decoding of the “func” Field
[Figure: op (6 bits) feeds Main Control, which outputs ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and outputs the 3-bit ALUctr.]
 | R-type | ori | lw | sw | beq | jump
ALUop (Symbolic) | “R-type” | Or | Add | Add | Subtract | xxx
ALUop<2:0> | 1 00 | 0 10 | 0 00 | 0 00 | 0 01 | xxx
R-type format: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
Get func from the back of the book for R-type. Our processor implements only a subset of these operations.
funct<5:0> | Instruction Operation
10 0000 | add
10 0010 | subtract
10 0100 | and
10 0101 | or
10 1010 | set-on-less-than
ALUctr<2:0> | ALU Operation
000 | And
001 | Or
010 | Add
110 | Subtract
111 | Set-on-less-than
The Truth Table for ALUctr<2>
ALUop encodings:
 | R-type | ori | lw | sw | beq
ALUop (Symbolic) | “R-type” | Or | Add | Add | Subtract
ALUop<2:0> | 1 00 | 0 10 | 0 00 | 0 00 | 0 01
func encodings:
funct<3:0> | Instruction Op.
0000 | add
0010 | subtract
0100 | and
0101 | or
1010 | set-on-less-than
Truth table:
ALUop bit<2> bit<1> bit<0> | func bit<3> bit<2> bit<1> bit<0> | ALU Operation | ALUctr bit<2> bit<1> bit<0>
0 0 0 | x x x x | Add | 0 1 0
0 x 1 | x x x x | Subtract | 1 1 0
0 1 x | x x x x | Or | 0 0 1
1 x x | 0 0 0 0 | Add | 0 1 0
1 x x | 0 0 1 0 | Subtract | 1 1 0
1 x x | 0 1 0 0 | And | 0 0 0
1 x x | 0 1 0 1 | Or | 0 0 1
1 x x | 1 0 1 0 | Set on < | 1 1 1
This control is for more R-type instructions than our processor, but fewer than the entire MIPS ISA.
The Logic Equation for ALUctr<2>
Rows of the truth table where ALUctr<2> = 1:
ALUop bit<2> bit<1> bit<0> | func bit<3> bit<2> bit<1> bit<0> | ALUctr<2>
0 x 1 | x x x x | 1
1 x x | 0 0 1 0 | 1
1 x x | 1 0 1 0 | 1
° ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & func<1>
The Truth Table for ALUctr <1>
(Same truth table as for ALUctr<2>; the output column of interest here is ALUctr bit<1>.)
The Logic Equation for ALUctr<1>
Rows of the truth table where ALUctr<1> = 1:
ALUop bit<2> bit<1> bit<0> | func bit<3> bit<2> bit<1> bit<0> | ALUctr<1>
0 0 0 | x x x x | 1
0 x 1 | x x x x | 1
1 x x | 0 0 0 0 | 1
1 x x | 0 0 1 0 | 1
1 x x | 1 0 1 0 | 1
° ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2>
The Truth Table for ALUctr<0>
(Same truth table as for ALUctr<2>; the output column of interest here is ALUctr bit<0>.)
The Logic Equation for ALUctr<0>
Rows of the truth table where ALUctr<0> = 1:
ALUop bit<2> bit<1> bit<0> | func bit<3> bit<2> bit<1> bit<0> | ALUctr<0>
0 1 x | x x x x | 1
1 x x | 0 1 0 1 | 1
1 x x | 1 0 1 0 | 1
° ALUctr<0> = !ALUop<2> & ALUop<1> + ALUop<2> & func<2> & func<0> + ALUop<2> & func<3>
The ALU Control Block
[Figure: ALU Control (Local) takes func (6 bits) and ALUop (3 bits) and outputs ALUctr (3 bits).]
° ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & func<1>
° ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2>
° ALUctr<0> = !ALUop<2> & ALUop<1> + ALUop<2> & func<2> & func<0> + ALUop<2> & func<3>
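The ALU control equations can be checked against the rows of the truth table with a short script. This is a sketch with hypothetical names. The ALUctr<1> term is written with !func<2>, which is what the And and Or rows of the truth table require. Don't-care input bits are set to 0 in the test rows.

```python
def alu_control(aluop, func):
    """aluop: 3-bit int; func: low 4 bits of funct. Returns 3-bit ALUctr."""
    a2, a1, a0 = (aluop >> 2) & 1, (aluop >> 1) & 1, aluop & 1
    f3, f2, f1, f0 = (func >> 3) & 1, (func >> 2) & 1, (func >> 1) & 1, func & 1

    def inv(bit):
        return bit ^ 1  # single-bit NOT

    c2 = (inv(a2) & a0) | (a2 & f1)
    c1 = (inv(a2) & inv(a1)) | (a2 & inv(f2))
    c0 = (inv(a2) & a1) | (a2 & f2 & f0) | (a2 & f3)
    return (c2 << 2) | (c1 << 1) | c0

# Rows of the truth table: (ALUop, func<3:0>) -> ALUctr
TRUTH_TABLE = {
    (0b000, 0b0000): 0b010,  # lw/sw: Add
    (0b001, 0b0000): 0b110,  # beq: Subtract
    (0b010, 0b0000): 0b001,  # ori: Or
    (0b100, 0b0000): 0b010,  # R-type add
    (0b100, 0b0010): 0b110,  # R-type subtract
    (0b100, 0b0100): 0b000,  # R-type and
    (0b100, 0b0101): 0b001,  # R-type or
    (0b100, 0b1010): 0b111,  # R-type set on less than
}
mismatches = [k for k, v in TRUTH_TABLE.items() if alu_control(*k) != v]
```

An empty mismatches list means the three equations reproduce every row of the table.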
The “Truth Table” for the Main Control
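[Figure: op (6 bits) feeds Main Control, which outputs RegDst, ALUSrc, and the other control signals plus ALUop (3 bits); ALU Control (Local) takes ALUop and func (6 bits) and outputs ALUctr (3 bits).]
op | 00 0000 | 00 1101 | 10 0011 | 10 1011 | 00 0100 | 00 0010
 | R-type | ori | lw | sw | beq | jump
RegDst | 1 | 0 | 0 | x | x | x
ALUSrc | 0 | 1 | 1 | 1 | 0 | x
MemtoReg | 0 | 0 | 1 | x | x | x
RegWrite | 1 | 1 | 1 | 0 | 0 | 0
MemWrite | 0 | 0 | 0 | 1 | 0 | 0
Branch | 0 | 0 | 0 | 0 | 1 | 0
Jump | 0 | 0 | 0 | 0 | 0 | 1
ExtOp | x | 0 | 1 | 1 | x | x
ALUop (Symbolic) | “R-type” | Or | Add | Add | Subtract | xxx
ALUop <2> | 1 | 0 | 0 | 0 | 0 | x
ALUop <1> | 0 | 1 | 0 | 0 | 0 | x
ALUop <0> | 0 | 0 | 0 | 0 | 1 | x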
The “Truth Table” for RegWrite
op | 00 0000 | 00 1101 | 10 0011 | 10 1011 | 00 0100 | 00 0010
 | R-type | ori | lw | sw | beq | jump
RegWrite | 1 | 1 | 1 | 0 | 0 | 0
° RegWrite = R-type + ori + lw
= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)
+ !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)
+ op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)
[Figure: six AND gates over op<5> .. op<0> decode R-type, ori, lw, sw, beq, and jump; an OR gate combines R-type, ori, and lw into RegWrite.]
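The RegWrite equation can be evaluated for each opcode in the table. A minimal sketch with hypothetical names; the opcodes are those listed in the op row.

```python
OPCODES = {"R-type": 0b000000, "ori": 0b001101, "lw": 0b100011,
           "sw": 0b101011, "beq": 0b000100, "jump": 0b000010}

def reg_write(op):
    """RegWrite = R-type + ori + lw, as a sum of three product terms."""
    b = [(op >> i) & 1 for i in range(6)]  # b[i] is op<i>
    nb = [x ^ 1 for x in b]                # nb[i] is !op<i>
    r_type = nb[5] & nb[4] & nb[3] & nb[2] & nb[1] & nb[0]
    ori = nb[5] & nb[4] & b[3] & b[2] & nb[1] & b[0]
    lw = b[5] & nb[4] & nb[3] & nb[2] & b[1] & b[0]
    return r_type | ori | lw

results = {name: reg_write(op) for name, op in OPCODES.items()}
```

Each product term corresponds to one AND gate of the decoder, and the final OR corresponds to the gate that drives RegWrite.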
PLA Implementation of the Main Control
[Figure: PLA. An AND plane of six gates over op<5> .. op<0> decodes R-type, ori, lw, sw, beq, and jump; an OR plane combines these into RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, and ALUop<0>.]
Putting it All Together: A Single Cycle Processor
[Figure: the single cycle processor. The Instruction Fetch Unit (nPC_sel, Clk) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. Main Control takes op = Instr<31:26> (6 bits) and drives RegDst, ALUSrc, the other control signals, and the 3-bit ALUop; ALU Control takes ALUop and func = Instr<5:0> (6 bits) and drives the 3-bit ALUctr. The datapath is the single cycle datapath: RegDst Mux, 32 32-bit Registers (RegWr), Extender (ExtOp) on Instr<15:0>, ALUSrc Mux, ALU (Zero), Data Memory (MemWr), and MemtoReg Mux onto busW.]
An abstract view of the critical path - load instruction
[Figure: PC and Next Address logic, Ideal Instruction Memory (Instruction Address in; Rd, Rs, Rt, and Imm fields out), 32 32-bit Registers (Rw, Ra, Rb), ALU (inputs A and B), and Ideal Data Memory (Data Address, Data In), all on Clk.]
Critical Path (Load Operation) =
PC’s Clk-to-Q +
Instruction Memory’s Access Time +
Register File’s Access Time +
ALU to Perform a 32-bit Add +
Data Memory Access Time +
Setup Time for Register File Write +
Clock Skew
Worst case delay for load is much longer than needed for all other instructions, yet this sets the cycle time.
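The critical path sum can be worked through with numbers. All delay values below are assumptions chosen for illustration, not figures from the lecture; the dictionary keys are hypothetical names for the seven terms.

```python
# Hypothetical component delays in ns (assumed values, not from the lecture).
DELAYS = {
    "pc_clk_to_q": 0.5,
    "inst_mem_access": 2.0,
    "reg_file_access": 1.0,
    "alu_add": 2.0,
    "data_mem_access": 2.0,
    "reg_write_setup": 0.5,
    "clock_skew": 0.5,
}

LOAD_PATH = ("pc_clk_to_q", "inst_mem_access", "reg_file_access",
             "alu_add", "data_mem_access", "reg_write_setup", "clock_skew")

def load_cycle_time(delays):
    """Sum of the delays along the load instruction's critical path."""
    return sum(delays[name] for name in LOAD_PATH)

cycle_ns = load_cycle_time(DELAYS)
```

With these assumed delays the cycle time comes to 8.5 ns. An instruction such as addu, which skips the data memory access, would still be charged the full cycle.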
An Abstract View of the Implementation
[Figure: Control receives the Instruction and Conditions and sends Control Signals to the Datapath. The Datapath contains PC and Next Address logic, Ideal Instruction Memory (Instruction Address in; Rd, Rs, Rt fields out), 32 32-bit Registers (Rw, Ra, Rb), ALU (inputs A and B), and Ideal Data Memory (Data Address, Data In, Data Out), all on Clk.]
° Logical vs. Physical Structure
Summary
° 5 steps to design a processor
• 1. Analyze instruction set => datapath requirements
• 2. Select set of datapath components & establish clock methodology
• 3. Design datapath meeting the requirements
• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
• 5. Design the control logic
° MIPS makes it easier
• Instructions same size
• Source registers always in same place
• Immediates same size, location
• Operations always on registers/immediates
° Single cycle datapath => CPI = 1, clock cycle time (CCT) => long
° Next time: implementing control