EECC550 - Shaaban

CPU Design Steps
1. Analyze instruction set operations using independent
RTN => datapath requirements.
2. Select set of datapath components & establish clock
methodology.
3. Assemble datapath meeting the requirements.
4. Analyze implementation of each instruction to determine
setting of control points that effects the register transfer.
5. Assemble the control logic.
EECC550 - Shaaban
#1 Lec # 5 Winter 2000 12-20-2000
CPU Design & Implementation Process
• Bottom-up Design:
– Assemble components in target technology to establish critical timing.
• Top-down Design:
– Specify component behavior from high-level requirements.
• Iterative refinement:
– Establish a partial solution, expand and improve.
[Figure: design hierarchy. Instruction Set Architecture => processor; the processor divides into datapath and control. Labels shown: Reg. File, Mux, ALU, Reg, Mem, Decoder, Sequencer, Cells, Gates.]
Single Cycle MIPS Datapath:
[Figure: single-cycle MIPS datapath. Next-PC logic (PC, two adders, PC Ext, mux) addresses the instruction memory; Instruction<31:0> supplies Rs <21:25>, Rt <16:20>, Rd <11:15> and imm16 <0:15>; a register file of 32 32-bit registers (ports Rw, Ra, Rb; buses busW, busA, busB); an extender; the ALU; and the data memory (WrEn, Adr, Data In). Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Status signal: Equal.]
CPI = 1, Long Clock Cycle
Drawback of Single Cycle Processor
• Long cycle time.
• All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other
instructions.
• Real memory is not as well-behaved as idealized memory
– Cannot always complete data access in one (short) cycle.
Abstract View of Single Cycle CPU
[Figure: Next PC and PC feed Instruction Fetch, followed by Register Fetch, Ext, ALU, Mem Access (Data Mem), and Result Store (Reg. Wrt). Main Control takes op and ALU control takes fun. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr. Status signal: Equal.]
Single Cycle Instruction Timing
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (Critical Path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
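A minimal sketch of how the single-cycle clock period follows from the slowest instruction path. The component delays are hypothetical placeholders (the slides give no numbers), and the path lists are an assumed reading of the timing diagram above.

```python
# Hypothetical component delays in ns (assumed for illustration only).
DELAY_NS = {
    "PC": 1, "Inst Memory": 10, "Reg File": 5, "mux": 1,
    "ALU": 8, "Data Mem": 10, "cmp": 3, "setup": 1,
}

# Components each instruction class passes through within its single cycle.
PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}


def path_delay(path):
    """Sum the delays of the components along one path."""
    return sum(DELAY_NS[component] for component in path)


def single_cycle_period():
    """Return (slowest class, its delay, all delays); the clock must cover the slowest."""
    delays = {name: path_delay(path) for name, path in PATHS.items()}
    slowest = max(delays, key=delays.get)
    return slowest, delays[slowest], delays
```

With these assumed delays the load path sets the period, so every other instruction class waits out time it does not need.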
Reducing Cycle Time: Multi-Cycle Design
• Cut combinational dependency graph by inserting registers / latches.
• The same work is done in two or more fast cycles, rather than one slow
cycle.
[Figure: a block of acyclic combinational logic between two storage elements is split into Acyclic Combinational Logic (A) and Acyclic Combinational Logic (B), with a new storage element inserted between them.]
Clock Cycle Time & Critical Path
• Critical path: the slowest path between any two storage devices.
• Cycle time is a function of the critical path; it must be greater than:
– Clock-to-Q + Longest Path through the Combinational Logic + Setup
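The bound above, and the effect of cutting the combinational logic with an extra storage element, can be sketched as follows. The function names and the delay values in the usage note are assumptions for illustration, not figures from the slides.

```python
def min_cycle_time(clk_to_q, longest_logic_path, setup):
    """Lower bound on the clock period for one register-to-register path."""
    return clk_to_q + longest_logic_path + setup


def split_cycle_time(clk_to_q, logic_a, logic_b, setup):
    """Period after a storage element splits the logic into parts A and B.

    Each part now sits between its own pair of storage elements,
    so the slower of the two parts sets the clock period.
    """
    return max(
        min_cycle_time(clk_to_q, logic_a, setup),
        min_cycle_time(clk_to_q, logic_b, setup),
    )
```

For example, assuming 1 ns clock-to-Q, 1 ns setup, and 20 ns of logic split into 12 ns and 8 ns parts, `min_cycle_time(1, 20, 1)` gives 22 and `split_cycle_time(1, 12, 8, 1)` gives 14. The cycle is shorter, although the two cycles together take slightly longer than the original one because each stage pays the clock-to-Q and setup overhead.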
Instruction Processing Cycles
Instruction Fetch: Obtain instruction from program storage.
Next Instruction: Update program counter to address of next instruction.
Instruction Decode: Determine instruction type; obtain operands from registers.
Execute: Compute result value or status.
Result Store: Store result in register/memory if needed (usually called Write Back).
(A brace on the slide marks the common steps for all instructions.)
Partitioning The Single Cycle Datapath
[Figure: the single-cycle datapath divided into steps: Next PC / PC, Instruction Fetch, Operand Fetch (Reg. File), Exec, Mem Access (Data Mem), Result Store. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr.]
Add registers between smallest steps
Example Multi-cycle Datapath
[Figure: Next PC / PC, Instruction Fetch, IR, Operand Fetch (Reg File), A and B, Ext, ALU, R, Mem Access (Data Mem), M, Result Store (Reg. File). Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr. Status signal: Equal.]
Registers added:
– IR: Instruction register.
– A, B: Two registers to hold operands read from register file.
– R (or ALUOut): holds the output of the ALU.
– M (or Memory Data Register, MDR): holds data read from data memory.
Operations In Each Cycle
R-Type
  Instruction Fetch:  IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution:          R ← A + B
  Write Back:         R[rd] ← R; PC ← PC + 4
Logic Immediate
  Instruction Fetch:  IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution:          R ← A OR ZeroExt[imm16]
  Write Back:         R[rt] ← R; PC ← PC + 4
Load
  Instruction Fetch:  IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution:          R ← A + SignEx(Im16)
  Memory:             M ← Mem[R]
  Write Back:         R[rt] ← M; PC ← PC + 4
Store
  Instruction Fetch:  IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution:          R ← A + SignEx(Im16)
  Memory:             Mem[R] ← B; PC ← PC + 4
Branch
  Instruction Fetch:  IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution:          If Equal = 1, PC ← PC + 4 + (SignExt(imm16) x4); else PC ← PC + 4
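The per-cycle register transfers above can be traced with a small behavioral model. This is a sketch under stated assumptions: the `run` function, the `cpu` dictionary layout, and the instruction names `add`, `ori`, `lw`, `sw`, `beq` are hypothetical stand-ins for the five instruction types, and the model counts cycles rather than simulating a clock.

```python
def sign_ext16(imm):
    """Interpret the low 16 bits of imm as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def run(kind, rs, rt, rd, imm16, cpu):
    """Apply one instruction's register transfers; return the cycles used.

    cpu is a dict (hypothetical layout): "PC", "R" (32 registers), "MEM".
    """
    if kind not in {"add", "ori", "lw", "sw", "beq"}:
        raise ValueError(f"unsupported instruction: {kind}")
    R, mem = cpu["R"], cpu["MEM"]
    # Cycle 1, instruction fetch: IR <- Mem[PC] (fields arrive as arguments).
    # Cycle 2, decode / operand fetch: A <- R[rs]; B <- R[rt]
    a, b = R[rs], R[rt]
    if kind == "beq":
        # Cycle 3, execution: choose the next PC and finish.
        offset = sign_ext16(imm16) * 4 if a == b else 0
        cpu["PC"] += 4 + offset
        return 3
    # Cycle 3, execution: R <- ALU result
    if kind == "add":
        r = (a + b) & 0xFFFFFFFF
    elif kind == "ori":
        r = a | (imm16 & 0xFFFF)
    else:  # lw or sw: effective address
        r = (a + sign_ext16(imm16)) & 0xFFFFFFFF
    if kind == "sw":
        # Cycle 4, memory: Mem[R] <- B; PC <- PC + 4
        mem[r] = b
        cpu["PC"] += 4
        return 4
    if kind == "lw":
        # Cycle 4, memory: M <- Mem[R]; cycle 5, write back: R[rt] <- M
        R[rt] = mem.get(r, 0)
        cycles = 5
    else:
        # Cycle 4, write back: R[rd] <- R (add) or R[rt] <- R (ori)
        R[rd if kind == "add" else rt] = r
        cycles = 4
    cpu["PC"] += 4
    return cycles
```

In this model a load takes five cycles, a branch three, and the other types four, which matches the number of rows listed for each type.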
Finite State Machine (FSM) Control Model
• State specifies control points for Register Transfer.
• Transfer occurs upon exiting state (same falling edge).
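[Figure: State X holds the control state and asserts the control points for its register transfer. Next State Logic takes the inputs (conditions) and the current state; Output Logic produces the outputs (control points). The next state depends on input.]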
Control Specification For Multi-cycle CPU
Finite State Machine (FSM)
“instruction fetch”: IR ← MEM[PC]
“decode / operand fetch”: A ← R[rs]; B ← R[rt]
Execute:
– R-type: R ← A fun B
– ORi: R ← A or ZX
– LW: R ← A + SX
– SW: R ← A + SX
– BEQ & Equal: PC ← PC + SX || 00; to instruction fetch
– BEQ & ~Equal: PC ← PC + 4; to instruction fetch
Memory:
– LW: M ← MEM[R]
– SW: MEM[R] ← B; PC ← PC + 4; to instruction fetch
Write-back:
– R-type: R[rd] ← R; PC ← PC + 4; to instruction fetch
– ORi: R[rt] ← R; PC ← PC + 4; to instruction fetch
– LW: R[rt] ← M; PC ← PC + 4; to instruction fetch
Traditional FSM Controller
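[Figure: a State register (4 bits) together with op (6 bits) and Equal forms 11 inputs to the next-state and output logic, specified as a Truth or Transition Table with columns state, op, cond => next state, control points. Outputs are the next State and the control points sent to the datapath.]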
Traditional FSM Controller
datapath + state diagram => control
• Translate RTN statements into
control points.
• Assign states.
• Implement the controller.
Mapping RTNs To Control Points Examples & State Assignments
0000 “instruction fetch”: IR ← MEM[PC] (control points: imem_rd, IRen)
0001 “decode / operand fetch”: A ← R[rs]; B ← R[rt] (control points: Aen, Ben)
Execute:
– 0100 R-type: R ← A fun B (control points: ALUfun, Sen)
– 0110 ORi: R ← A or ZX
– 1000 LW: R ← A + SX
– 1011 SW: R ← A + SX
– 0010 BEQ & Equal: PC ← PC + SX || 00; to instruction fetch state 0000
– 0011 BEQ & ~Equal: PC ← PC + 4; to instruction fetch state 0000
Memory:
– 1001 LW: M ← MEM[S]
– 1100 SW: MEM[S] ← B; PC ← PC + 4; to instruction fetch state 0000
Write-back:
– 0101 R-type: R[rd] ← R; PC ← PC + 4 (control points: RegDst, RegWr, PCen); to instruction fetch state 0000
– 0111 ORi: R[rt] ← R; PC ← PC + 4; to instruction fetch state 0000
– 1010 LW: R[rt] ← M; PC ← PC + 4; to instruction fetch state 0000
Detailed Control Specification
                         IR  PC      Ops  Exec          Mem    Write-Back
State Op field Eq  Next  en  en sel  A B  Ex Sr ALU S   R W M  M-R Wr Dst
0000  ??????   ?   0001  1
0001  BEQ      0   0011              1 1
0001  BEQ      1   0010              1 1
0001  R-type   x   0100              1 1
0001  orI      x   0110              1 1
0001  LW       x   1000              1 1
0001  SW       x   1011              1 1
0010  xxxxxx   x   0000      1  1
0011  xxxxxx   x   0000      1  0
0100  xxxxxx   x   0101                   0  1  fun 1
0101  xxxxxx   x   0000      1  0                              0   1  1
0110  xxxxxx   x   0111                   0  0  or  1
0111  xxxxxx   x   0000      1  0                              0   1  0
1000  xxxxxx   x   1001                   1  0  add 1
1001  xxxxxx   x   1010                                 1 0 0
1010  xxxxxx   x   0000      1  0                              1   1  0
1011  xxxxxx   x   1100                   1  0  add 1
1100  xxxxxx   x   0000      1  0                       0 1
States 0010-0011 belong to BEQ, 0100-0101 to R, 0110-0111 to ORI, 1000-1010 to LW, and 1011-1100 to SW.
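The next-state columns of the control specification above can be written down as a lookup table. This is a sketch, not the controller implementation from the lecture: the names `DISPATCH`, `SEQUENTIAL`, `next_state`, and `trace` are hypothetical, and the control-point outputs are left out.

```python
FETCH, DECODE = "0000", "0001"

# Transitions out of the decode state, keyed by (op, Equal).
# None means Equal is a "don't care" for that opcode.
DISPATCH = {
    ("BEQ", 0): "0011",
    ("BEQ", 1): "0010",
    ("R-type", None): "0100",
    ("ORi", None): "0110",
    ("LW", None): "1000",
    ("SW", None): "1011",
}

# States whose successor does not depend on op or Equal.
SEQUENTIAL = {
    "0000": "0001",
    "0010": "0000", "0011": "0000",
    "0100": "0101", "0101": "0000",
    "0110": "0111", "0111": "0000",
    "1000": "1001", "1001": "1010", "1010": "0000",
    "1011": "1100", "1100": "0000",
}


def next_state(state, op, equal):
    """Look up the successor of state for the given opcode and Equal flag."""
    if state == DECODE:
        key = (op, equal) if op == "BEQ" else (op, None)
        return DISPATCH[key]
    return SEQUENTIAL[state]


def trace(op, equal=0):
    """States visited by one instruction, from fetch until control returns to fetch."""
    states = [FETCH]
    while True:
        following = next_state(states[-1], op, equal)
        if following == FETCH:
            return states
        states.append(following)
```

The length of `trace(op)` is the number of cycles that instruction type spends in the controller, for example five states for `LW` and three for `BEQ`.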
Alternative Multiple Cycle Datapath (In Textbook)
• Minimizes hardware: 1 memory, 1 adder
[Figure: multi-cycle datapath with a single Ideal Memory (RAdr, WrAdr, Din, Dout) and a single ALU. Components: PC, Instruction Reg, Reg File (Ra, Rb, Rw; busA, busB, busW), Extend, << 2, Target, ALU Out, ALU Control, and multiplexors. Control signals: PCWr, PCWrCond, PCSrc, IorD, MemWr, IRWr, RegDst, RegWr, ExtOp, ALUSelA, ALUSelB, ALUOp, MemtoReg, BrWr. Status signal: Zero.]
Alternative Multiple Cycle Datapath (In Textbook)
• Shared instruction/data memory unit.
• A single ALU shared among instructions.
• Shared units require additional or widened multiplexors.
• Temporary registers to hold data between clock cycles of the instruction:
– Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut
Operations In Each Cycle
R-Type
  Instruction Fetch:  IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x4)
  Execution:          ALUout ← A + B
  Write Back:         R[rd] ← ALUout
Logic Immediate
  Instruction Fetch:  IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x4)
  Execution:          ALUout ← A OR ZeroExt[imm16]
  Write Back:         R[rt] ← ALUout
Load
  Instruction Fetch:  IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x4)
  Execution:          ALUout ← A + SignEx(Im16)
  Memory:             M ← Mem[ALUout]
  Write Back:         R[rt] ← M
Store
  Instruction Fetch:  IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x4)
  Execution:          ALUout ← A + SignEx(Im16)
  Memory:             Mem[ALUout] ← B
Branch
  Instruction Fetch:  IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x4)
  Execution:          If Equal = 1, PC ← ALUout
High-Level View of Finite State Machine Control
• First steps are independent of the instruction class.
• Then a series of sequences that depend on the instruction opcode.
• Then the control returns to fetch a new instruction.
• Each box in the state diagram represents one or several states.
Instruction Fetch and Decode FSM States
Load/Store Instructions FSM States
R-Type Instructions FSM States
Branch Instruction Single State
Jump Instruction Single State
Finite State Machine (FSM) Specification
0000 “instruction fetch”: IR ← MEM[PC]; PC ← PC + 4
0001 “decode”: A ← R[rs]; B ← R[rt]; ALUout ← PC + SX
Execute:
– 0100 R-type: ALUout ← A fun B
– 0110 ORi: ALUout ← A op ZX
– 1000 LW: ALUout ← A + SX
– 1011 SW: ALUout ← A + SX
– 0010 BEQ: If A = B then PC ← ALUout; to instruction fetch
Memory:
– 1001 LW: M ← MEM[ALUout]
– 1100 SW: MEM[ALUout] ← B; to instruction fetch
Write-back:
– 0101 R-type: R[rd] ← ALUout; to instruction fetch
– 0111 ORi: R[rt] ← ALUout; to instruction fetch
– 1010 LW: R[rt] ← M; to instruction fetch
MIPS Multi-cycle Datapath
Performance Evaluation
• What is the average CPI?
– State diagram gives CPI for each instruction type
– Workload below gives frequency of each type
Type          CPIi for type   Frequency   CPIi x freqIi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI:                              4.1
Better than CPI = 5 if all instructions took the same number of clock cycles (5).
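The average CPI is the frequency-weighted sum of the per-type CPIs. A minimal sketch of that arithmetic, using the workload figures above (the names `WORKLOAD` and `average_cpi` are hypothetical):

```python
# Instruction type -> (CPI for the type, fraction of the instruction mix).
WORKLOAD = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}


def average_cpi(workload):
    """Weighted average: sum of CPI_i x frequency_i over all instruction types."""
    return sum(cpi * freq for cpi, freq in workload.values())
```

Calling `average_cpi(WORKLOAD)` reproduces the 4.1 figure, up to floating-point rounding. Compared with a design where every instruction takes 5 cycles, that is roughly 5 / 4.1, or about 1.22 times fewer cycles for this instruction mix.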