CPU Organization (Design) Datapath Design:

advertisement
ISA
Requirements
CPU Organization (Design)
• Datapath Design:
Components & their connections needed by ISA instructions
– Capabilities & performance characteristics of principal
Functional Units (FUs) needed by ISA instructions
– (e.g., Registers, ALU, Shifters, Logic Units, ...) Components
– Ways in which these components are interconnected (buses
connections, multiplexors, etc.). Connections
– How information flows between components.
• Control Unit Design:
Control/sequencing of operations of datapath components
to realize ISA instructions
– Logic and means by which such information flow is controlled.
– Control and coordination of FUs operation to realize the targeted
Instruction Set Architecture to be implemented (can either be
implemented using a finite state machine or a microprogram).
• Hardware description with a suitable language, possibly
using Register Transfer Notation (RTN).
4th Edition Chapter 4.1-4.4 - 3rd Edition Chapter 5.1-5.4
EECC550 - Shaaban
#1 Lec # 4 Winter 2012 12-11-2012
Major CPU Design Steps
1 Analyze instruction set to get datapath requirements:
– Using independent RTN, write the micro-operations required for target ISA
instructions.
1
2
• This provides the the required datapath components and how they are
connected.
2 Select set of datapath components and establish clocking methodology
(defines when storage or state elements can read and when they can be
written, e.g clock edge-triggered) e.g Flip-Flops
3 Assemble datapath meeting the requirements.
4 Identify and define the function of all control lines, points or signals
needed by the datapath.
– Analyze implementation of each instruction to determine setting of control
points that affects its operations.
5 Control unit design, based on micro-operation timing and control
signals identified:
– Combinational logic: For single cycle CPU. e.g Any instruction completed in one cycle
i.e CPI = 1
– Hard-Wired: Finite-state machine implementation.
– Microprogrammed.
ISA Requirements  CPU Design
EECC550 - Shaaban
#2 Lec # 4 Winter 2012 12-11-2012
CPU Design & Implantation Process
• Top-down Design:
– Specify component behavior from high-level requirements (ISA).
• Bottom-up Design:
– Assemble components in target technology to establish critical timing
(hardware delays, critical path timing).
• Iterative refinement:
– Establish a partial solution, expand and improve.
Instruction Set
Architecture (ISA):
Provides
Requirements
Reg. File
Mux
Processor
Datapath
ALU
Target VLSI implementation
Technology
Reg
Cells
ISA Requirements
CPU Design
Control
Mem
Decoder
Sequencer
Gates
EECC550 - Shaaban
#3 Lec # 4 Winter 2012 12-11-2012
Datapath Design Steps
• Write the micro-operation sequences required for a number of
representative target ISA instructions using independent RTN.
• Independent RTN statements specify: the required datapath
components and how they are connected.
1
2
• From the above, create an initial datapath by determining
possible destinations for each data source (i.e registers, ALU).
– This establishes connectivity requirements (data paths, or
connections) for datapath components.
– Whenever multiple sources are connected to a single input,
a multiplexor of appropriate size is added.
(or destination)
• Find the worst-time propagation delay (critical path) in the datapath
to determine the datapath clock cycle (CPU clock cycle, C).
• Complete the micro-operation sequences for all remaining
instructions adding datapath components +
connections/multiplexors as needed.
EECC550 - Shaaban
#4 Lec # 4 Winter 2012 12-11-2012
MIPS Instruction Formats
31
R-Type
I-Type: ALU
26
op
•
•
•
•
•
•
rt
11
6
0
rd
shamt
funct
5 bits
5 bits
5 bits
5 bits
6 bits
[31:26]
[25:21]
[20:16]
[15:11]
[10:6]
[5:0]
26
op
21
rs
16
0
Immediate (imm16)
rt
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
31
J-Type: Jumps
rs
16
6 bits
31
Load/Store, Branch
21
26
op
Or address offset
0
target address
6 bits
26 bits
[31:26]
[25:0]
op: Opcode, operation of the instruction.
rs, rt, rd: The source and destination register specifiers.
shamt: Shift amount.
funct: Selects the variant of the operation in the “op” field.
address / immediate: Address offset or immediate value.
target address: Target address of the jump instruction.
EECC550 - Shaaban
#5 Lec # 4 Winter 2012 12-11-2012
MIPS R-Type (ALU) Instruction Fields
R-Type: All ALU instructions that use three registers
1st operand
Opcode for
R-Type= 0
OP
rs
6 bits
[31:26]
•
•
•
•
•
•
2nd operand
Destination
rt
rd
shamt
funct
5 bits
5 bits
5 bits
5 bits
6 bits
[25:21]
[20:16]
[15:11]
[10:6]
[5:0]
Function
Field
Rs, rt , rd
op: Opcode, basic operation of the instruction.
are register specifier fields
– For R-Type op = 0
Independent RTN:
rs: The first register source operand.
Instruction Word  Mem[PC]
rt: The second register source operand.
R[rd]  R[rs] funct R[rt]
rd: The register destination operand.
PC  PC + 4
shamt: Shift amount used in constant shift operations.
funct: Function, selects the specific variant of operation in the op field.
Funct field value examples:
Add = 32 Sub = 34 AND = 36 OR =37 NOR = 39
Operand register in rs
Destination register in rd
Examples:
add $1,$2,$3
sub $1,$2,$3
R-Type = Register Type
Register Addressing used (Mode 1)
Operand register in rt
and $1,$2,$3
or $1,$2,$3
EECC550 - Shaaban
#6 Lec # 4 Winter 2012 12-11-2012
MIPS ALU I-Type Instruction Fields
I-Type ALU instructions that use two registers and an immediate value
Loads/stores, conditional branches.
1st operand
•
•
•
•
Destination
OP
rs
rt
6 bits
5 bits
5 bits
[31:26]
[25:21]
[20:16]
2nd operand
Immediate (imm16)
16 bits imm16
[15:0]
Independent RTN for addi:
op: Opcode, operation of the instruction.
Instruction Word  Mem[PC]
rs: The register source operand.
R[rt]  R[rs] + imm16
PC  PC + 4
rt: The result destination register.
immediate: Constant second operand for ALU instruction.
imm16
Source operand register in rs
OP = 8
Examples:
OP = 12
Result register in rt
add immediate:
addi $1,$2,100
and immediate
andi $1,$2,10
I-Type = Immediate Type
Immediate Addressing used (Mode 2)
Constant operand
in immediate
EECC550 - Shaaban
imm16 = 16 bit immediate field
#7 Lec # 4 Winter 2012 12-11-2012
MIPS Load/Store I-Type Instruction Fields
Base
•
•
•
•
Src./Dest.
OP
rs
rt
address (e.g. offset)
6 bits
5 bits
5 bits
[31:26]
[25:21]
[20:16]
16 bits imm16
[15:0]
Signed address
offset in bytes
op: Opcode, operation of the instruction.
– For load word op = 35, for store word op = 43.
rs: The register containing memory base address.
rt: For loads, the destination register. For stores, the source
register of value to be stored.
address: 16-bit memory address offset in bytes added to base
imm16
register.
Sign
Extended
base register in rs
Examples:
source register in rt
Store word:
Load word:
Destination register in rt
Base or Displacement Addressing used (Mode 3)
Offset
sw $3, 500($4)
lw $1, 32($2)
Offset
Instruction Word  Mem[PC]
Mem[R[rs] + imm16]  R[rt]
PC  PC + 4
Instruction Word  Mem[PC]
R[rt]  Mem[R[rs] + imm16]
PC  PC + 4
base register in rs
EECC550 - Shaaban
imm16 = 16 bit immediate field
#8 Lec # 4 Winter 2012 12-11-2012
MIPS Branch I-Type Instruction Fields
•
•
•
•
OP
rs
rt
address (e.g. offset)
6 bits
5 bits
5 bits
16 bits imm16
[31:26]
[25:21]
[20:16]
[15:0]
Signed address
offset in words
op: Opcode, operation of the instruction.
Word = 4 bytes
rs: The first register being compared
rt: The second register being compared.
address: 16-bit memory address branch target offset in words
imm16
added to PC to form branch address.
Register in rt
Register in rs
OP = 4
Examples:
OP = 5
offset in bytes equal to
instruction address field x 4
Imm16 x 4
Branch on equal
beq $1,$2,100
Branch on not equal
bne $1,$2,100
Independent RTN for beq:
Added
to PC+4 to form
branch target
Sign extended
Instruction Word  Mem[PC]
R[rs] = R[rt] :
PC  PC + 4 + imm16 x 4
R[rs]  R[rt] :
PC  PC + 4
PC-Relative Addressing used (Mode 4)
EECC550 - Shaaban
imm16 = 16 bit immediate field
#9 Lec # 4 Winter 2012 12-11-2012
MIPS J-Type Instruction Fields
J-Type: Include jump j, jump and link jal
•
•
OP
jump target
6 bits
26 bits
[31:26]
[25:0]
Word = 4 bytes
op: Opcode, operation of the instruction.
– Jump j op = 2
– Jump and link jal op = 3
jump target: jump memory address in words.
Examples:
Jump memory address in bytes equal to
instruction field jump target x 4
Jump
j 10000
Jump and link
jal 10000
Effective 32-bit jump address:
PC(31-28)
From
PC+4
4 bits
Independent RTN for j:
Jump target
in words
PC(31-28),jump_target,00
jump target = 2500
26 bits
0 0
2 bits
Instruction Word  Mem[PC]
PC  PC + 4
PC  PC(31-28),jump_target,00
J-Type = Jump Type Pseudodirect Addressing used (Mode 5)
EECC550 - Shaaban
#10 Lec # 4 Winter 2012 12-11-2012
A Subset of MIPS Instructions
ADD and SUB:
add rd, rs, rt
R
sub rd, rs, rt
32 = add
34 = sub
0
31
26
op
I OR Immediate:
rs
I BRANCH:
rd
shamt
funct
5 bits
6 bits
[31:26]
[25:21]
[20:16]
[15:11]
[10:6]
[5:0]
26
21
rs
16
rt
0
Immediate (imm16)
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
21
rs
16
rt
0
Immediate (imm16)
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
26
op
4
0
5 bits
31
beq rs, rt, imm16
6
5 bits
LOAD and STORE Word
I
lw rt, rs, imm16
31
26
sw rt, rs, imm16
op
35 = lw
43 = sw
rt
11
5 bits
op
13
16
6 bits
31
ori rt, rs, imm16
21
6 bits
[31:26]
21
rs
5 bits
[25:21]
Offset in bytes
16
rt
5 bits
[20:16]
0
Immediate (imm16)
16 bits
Offset in words
[15:0]
EECC550 - Shaaban
#11 Lec # 4 Winter 2012 12-11-2012
Basic MIPS Instruction Processing Steps
Instruction/program Memory
Instruction
Fetch
Obtain instruction from program storage
Instruction  Mem[PC]
Next
Update program counter to address
PC  PC + 4
Instruction
of next instruction
Instruction
Determine instruction type
Decode
Obtain operands from registers
Execute
Compute result value or status
}
Done by
Control Unit
(Based on Opcode)
Result
Store result in register/memory if needed
Store
(usually called Write Back).
T = I x CPI x C
Common
steps
for all
instructions
EECC550 - Shaaban
#12 Lec # 4 Winter 2012 12-11-2012
Overview of MIPS Instruction Micro-operations
•
Common
Steps
•
All instructions go through these common steps:
– Send program counter to instruction memory and fetch the instruction.
(fetch)
Instruction  Mem[PC]
– Update the program counter to point to next instruction PC  PC + 4
– Read one or two registers, using instruction fields. (decode)
• Load reads one register only.
Additional instruction execution actions (execution) depend on the
instruction in question, but similarities exist:
– All instruction classes (except J type) use the ALU after reading the
registers:
• Memory reference instructions use it for effective address calculation.
• Arithmetic and logic instructions (R-Type), use it for the specified
operation.
• Branches use it for comparison.
•
Additional execution steps where instruction classes differ:
– Memory reference instructions: Access memory for a load or store.
– Arithmetic and logic instructions: Write ALU result back in register.
– Branch instructions: Possibly change next instruction address (update PC)
based on comparison (i.e if branch is taken).
EECC550 - Shaaban
#13 Lec # 4 Winter 2012 12-11-2012
A Single Cycle MIPS CPU Design
Design target: A single-cycle per instruction MIPS CPU design
All micro-operations of an instruction are to be carried out in a single
CPU clock cycle. Cycles Per Instruction = CPI = 1
CPU Performance Equation:
T = I x CPI x C
4
Abstract view of single cycle MIPS
CPU showing major functional units
(components) and major connections
between them
CPI = 1
Ad d
Ad d
32
32
32
32
Data
Register #
PC
A ddress
Instruction
ISA
Re g ist e r s
ALU
A ddress
Register #
In st r uct ion
m e m or y
Dat a
m e m or y
Register #
32
4th Edition Figure 4.1 page 302 - 3rd Edition Figure 5.1 page 287
Data
EECC550 - Shaaban
#14 Lec # 4 Winter 2012 12-11-2012
R-Type Example:
Micro-Operation Sequence For ADD
add rd, rs, rt
0
32 = add
34 = sub
0
OP
6 bits
[31:26]
rs
rt
rd
shamt
funct
5 bits
5 bits
5 bits
5 bits
6 bits
[25:21]
[20:16]
[15:11]
[10:6]
[5:0]
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
R[rd]  R[rs] + R[rt]
Program
Memory
Common
Steps
Add register rs to register rt result
in register rd
i.e Funct =add = 32
Independent RTN ?
EECC550 - Shaaban
#15 Lec # 4 Winter 2012 12-11-2012
Initial Datapath Components
Instruction  Mem[PC]
Three components needed by: Instruction Fetch:
Program Counter Update: PC  PC + 4
32
I nstruction
address
32
I nstruction
32
In st r uct ion
m e m or y
a. Instruction memory
Add Sum
PC
32
32
32
Instruction
Word
b. Program counter
32
32-bit
c. Adder
Two state elements (memory) needed to store and access instructions:
1 Instruction memory:
• Only read access (by user code). No read control signal needed.
2 Program counter (PC): 32-bit register.
• Written at end of every clock cycle (edge-triggered) : No write control signal.
3 32-bit Adder: To compute the the next instruction address (PC + 4).
4th Edition Figure 4.5, page 308 - 3rd Edition Figure 5.5, page 293
+ Basics of logic design/logic building blocks review in Appendix C in Book CD (4 th Edition Appendix B)
EECC550 - Shaaban
#16 Lec # 4 Winter 2012 12-11-2012
Building The Datapath
Instruction Fetch
& PC Update: 2
1
Instruction  Mem[PC]
PC PC + 4
32
Add
32
4
32
2
PC  PC + 4
PC
32
Read
address
I nstruction
Portion of the datapath
used for fetching instructions
and incrementing the program
counter (PC).
32
In st r uct ion
m e m or y
PC write or update is edge triggered at the end of the cycle
4th
Edition Figure 4.6 page 309 -
3rd
Edition Figure 5.6 page 293
1
Instruction 
Mem[PC]
Clock input to PC, memory not shown
EECC550 - Shaaban
#17 Lec # 4 Winter 2012 12-11-2012
More Datapath Components
ISA Register File
5
Register
numbers
5
5
Data
e.g add = 0010
Main 32-bit ALU
Read
register 1
Read
data 1
4
R[rs]
32
Read
register 2
Re g ist e r s
Write
R[rt]
register
Read
data 2 32
Write
ALU operation
(Function)
32
Zero
Data
ALU ALU
result
32
32
Data
RegWrite
32-bit Arithmetic and Logic Unit (ALU)
a. Registers
Register File:
b. ALU
Zero = Zero flag = 1
When ALU result equals zero
• Contains all ISA registers.
• Two read ports and one write port.
• Register writes by asserting write control signal
• Clocking Methodology: Writes are edge-triggered.
• Thus can read and write to the same register in the same clock cycle.
4th Edition Figure 4.7, page 310 - 3rd Edition Figure 5.7, page 295
+ Basics of logic design/logic building blocks review in Appendix C in Book CD (4 th Edition Appendix B)
EECC550 - Shaaban
#18 Lec # 4 Winter 2012 12-11-2012
Register File Details
RW RA RB
Write Enable 5
5
5
• Register File consists of 32 registers:
busA
Write Data
– Two 32-bit output busses:
busW
32
32 32-bit
busA and busB
32
Registers busB
Clk
– One 32-bit input bus: busW
32
• Register is selected by:
– RA (number) selects the register to put on busA (data):
busA = R[RA]
– RB (number) selects the register to put on busB (data):
busB = R[RB]
– RW (number) selects the register to be written
via busW (data) when Write Enable is 1
Write Enable: R[RW]  busW
• Clock input (CLK)
– The CLK input is a factor ONLY during write operations.
– During read operation, it behaves as a combinational logic block:
• RA or RB valid => busA or busB valid after “access time.”
EECC550 - Shaaban
#19 Lec # 4 Winter 2012 12-11-2012
A Possible Register File Implementation
Register Write Enable (RegWrite)
.
Write Port
Write
Write
Register
RW
5
rd?
rt?
5-to-32
Decoder
30
31
Write
.
..
.
Register 1
Data In
..
.
.
Write
.
.
Data In
Data
Out
Data
Out
Register 30
Register 31
32
.
.
2 Read Ports
0
1
..
.
R[rs]
32
32-to-1
MUX
Register
Read Data 1
(Bus A)
30
31
5
Read Register 1
(RA)
Data
Out
Data
Out
rs
32
0
1
32
Register Write Data (Bus W)
Clock input to registers
not shown in diagram
..
32
..
.
Data In
Write
32
Register 0
Data In
.
0
1
Each Register contains 32 edge triggered D-Flip Flops
RW RA RB
Write Enable 5 5 5
busA
busW
32
32 32-bit
32
Registers
busB
Clk
32
Also see Appendix C in Book CD (3rd Edition Appendix B) - The Basics of Logic Design
..
.
32-to-1
32
R[rt]
MUX
Register
Read Data 2
(Bus B)
30
31
5
Read Register 2
(RB)
rt
EECC550 - Shaaban
#20 Lec # 4 Winter 2012 12-11-2012
Idealized Memory
Write Enable
Address
Data In
DataOut
• Memory (idealized)
32
32
– One input bus: Data In.
Clk
– One output bus: Data Out.
Read Enable
• Memory word is selected by:
– Address selects the word to put on Data Out bus.
– Write Enable = 1: address selects the memory
word to be written via the Data In bus.
• Clock input (CLK):
– The CLK input is a factor ONLY during write operation,
– During read operation, this memory behaves as
a combinational logic block:
• Address valid => Data Out valid after “access time.”
• Ideal Memory = Short access time.
EECC550 - Shaaban
Compared to other components in CPU datapath
#21 Lec # 4 Winter 2012 12-11-2012
Clocking Methodology Used:
Edge Triggered Writes
Clk
Setup
Hold
Setup
Hold
Don’t Care
CLK-to-Q
.
.
.
CLK-to-Q
.
.
.
.
.
.
.
.
.
Critical Path (Longest delay path)
Clock
•
•
Clock
All storage element (e.g Flip-Flops, Registers, Data Memory) writes are triggered by
the same clock edge.
Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
Here writes are triggered on the rising edge of the clock
EECC550 - Shaaban
#22 Lec # 4 Winter 2012 12-11-2012
Simplified Datapath For MIPS
R-Type Instructions
From
Instruction
Memory
[25:21]
rs
4
R[rs]
[20:16]
(Function)
e.g add = 0010
rt
32
[15:11]
rd
R[rt]
32
32
Components and connections as specified by RTN statement
Clock input to register bank not shown
32
R[rd]  R[rs] + R[rt]
i.e Funct = function =add
Destination register R[rd] write or update is
edge triggered at the end of the cycle
EECC550 - Shaaban
#23 Lec # 4 Winter 2012 12-11-2012
More Detailed Datapath
For R-Type Instructions
With Control Points Identified
Rd Rs
RegWr
5
5
Rw
32
Clk
Ra Rb
32 32-bit
Registers
ALUctr Function =Add, Subtract …
4
5
busA
R[rs]
32
busB
R[rt]
ALU
busW
Rt
Result
32
32
R[rd]  R[rs] + R[rt]
i.e Funct = function =add
EECC550 - Shaaban
#24 Lec # 4 Winter 2012 12-11-2012
R-Type Register-Register Timing
PC+4
Clk
Old
Value
Rs, Rt, Rd,
Op, Func
PC
Clk-to-Q
New Value
Old
Value
ALUctr
Old
Value
RegWr
Old
Value
busA,
B
Old
Value
busW
Old
Value
Instruction Memory Access Time
New Value
Delay through Control Logic
New Value
New Value
Register File Access Time
New Value
ALU Delay
New Value
Rd Rs Rt
RegWr 5 5
5
busA
R[rs]
32
busB
32
R[rt]
ALU
busW
32
Clk
Rw Ra Rb
32 32-bit
Registers
Register Write
Occurs Here
ALUct
r
Result
32
All register
writes occur on
falling edge of clock
(clocking methodology)
EECC550 - Shaaban
#25 Lec # 4 Winter 2012 12-11-2012
Logical Operations with Immediate Example:
Micro-Operation Sequence For ORI
ori rt, rs, imm16
31
26
op
13
21
16
rs
rt
0
Immediate (imm16)
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
R[rt]  R[rs] OR ZeroExt[imm16]
OR register rs with immediate
field zero extended to 32 bits,
result in register rt
Common
Steps
Done by Main ALU
000 ….. 000
Not in book version
Imm16
EECC550 - Shaaban
#26 Lec # 4 Winter 2012 12-11-2012
Datapath For Logical
Instructions With Immediate
Rd
RegDst
1
Rt
Mux
RegWr
5
Rw
busW
2x1 MUX (width 5 bits)
0
Rs
Rt
5
5
ALUctr
R[rt]
busB
ZeroExt
16
Result
32
0
Mux
32
imm16
ALU
32
32 32-bit
Registers
32
Clk
R[rs]
busA
Ra Rb
Function = OR
2x1 MUX
(width 32 bits)
1
32
ALUSrc
R[rt]  R[rs] OR ZeroExt[imm16]
000 ….. 000
Imm16
EECC550 - Shaaban
#27 Lec # 4 Winter 2012 12-11-2012
Load Operations Example:
Micro-Operation Sequence For LW
lw rt, rs, imm16
31
26
op
35
21
rs
16
rt
0
Immediate (imm16)
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
Signed
Address offset
in bytes
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
Instruction Memory
R[rt]  Mem[R[rs] + SignExt[imm16]]
Effective Address
Data Memory
To load from
Common
Steps
Immediate field sign extended to
32 bits and added to register rs
to form memory load address,
write word at load effective address
to register rt
EECC550 - Shaaban
#28 Lec # 4 Winter 2012 12-11-2012
Additional Datapath Components For
Loads & Stores
4th Edition Figure 4.8, page 311
3rd Edition Figure 5.8, page 296
M emWrite
Address
32
32
Write
data
Read
data
32
Dat a
m e m or y
M emRead
a. Data memory unit
Inputs:
for address and write (store) data
Output
for read (load) data
Data memory write or update is edge triggered at the
end of the cycle (clocking methodology)
16
Si g n
ext end
32
For SignExt[imm16]
b. Sign-extension unit
16-bit input sign-extended
into a 32-bit value at the output
EECC550 - Shaaban
#29 Lec # 4 Winter 2012 12-11-2012
Datapath For Loads
Rd
RegDst
1
32
Clk
Rs
5
5
ALUctr Function = add
Base Address register
Rw Ra Rb
32 32-bit
Registers
32
busB R[rt] 0
1
32
ALUSrc
0
32
MemWr
32 bit
2X1 Mux
Mu
x
Extender
16
Offset
Effective
Address
Mux
32
imm16
MemtoReg
R[rs]
busA
ALU
busW
Rt
Mux 0
RegWr 5
LW
WrEn Adr
Data In
32
Clk
ExtOp
Data
Memory
32
1
MemRd
R[rt]  Mem[R[rs] + SignExt[imm16]]
Data Memory
Effective Address
EECC550 - Shaaban
#30 Lec # 4 Winter 2012 12-11-2012
Store Operations Example:
Micro-Operation Sequence For SW
sw rt, rs, imm16
31
26
op
43
21
rs
16
rt
0
Immediate (imm16)
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
Signed
Address offset
in bytes
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
Mem[R[rs] + SignExt[imm16]] R[rt]
Immediate field sign extended to
32 bits and added to register rs
to form memory store effective
address, register rt written to
memory at store effective address.
Effective Address
To store at
Data Memory
Common
Steps
EECC550 - Shaaban
#31 Lec # 4 Winter 2012 12-11-2012
Datapath For Stores
Rd
RegDst
1
Rt
Mux
32
Clk
5
Rt
5
Base Address register
Rw Ra Rb
32 32-bit
Registers
32
busB R[rt]
R[rt]
WrEn Adr
Data In 32
1
32
ExtOp
0
32
Clk
ALUSrc
Mu
x
Extender
16
Offset
Effective
Address
0
Mux
32
imm16
R[rs]
busA
ALU
busW
MemtoReg
MemWr
0
Rs
RegWr 5
ALUctr
Add =
SW
Data
Memory
32
1
MemRd
Mem[R[rs] + SignExt[imm16]] R[rt]
Data Memory
Effective Address
EECC550 - Shaaban
#32 Lec # 4 Winter 2012 12-11-2012
Conditional Branch Example:
Micro-Operation Sequence For BEQ
beq rs, rt, imm16
31
26
op
4
21
rs
16
rt
0
immediate
6 bits
5 bits
5 bits
16 bits
[31:26]
[25:21]
[20:16]
[15:0]
PC Offset
in words
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
Zero  R[rs] - R[rt]
Calculate the branch condition
R[rs] == R[rt]
Condition
Action
(i.e
Common
Steps
R[rs] - R[rt] = 0 )
Then Zero = 1
Zero : PC  PC + ( SignExt(imm16) x 4 )
Calculate the next
instruction’s PC address
Branch Target
“Zero” is zero flag of main ALU
EECC550 - Shaaban
#33 Lec # 4 Winter 2012 12-11-2012
Main ALU evaluates branch condition
New adder to compute branch target:
• Sum of incremented PC and
Datapath For
Branch Instructions
sign-extended lower 16-bits on the
instruction.
PC4 from instruction datapath
New 32-bit Adder
(Third ALU)
for Branch Target
Sh if t
le f t 2
PC + 4 + ( SignExt(imm16) x 4
[25:21] rs Read
I nstruction
[20:16] rt
register 1
Read
data 1
Read
register 2
Re g is t e r s
Write
register
Read
4
R[rs]
ISA
ALU operation = Subtract
To branch
control logic
ALU Zero
R[rt]
data 2
Write
data
(Main ALU)
RegWrite
[15:0] imm16
Branch
target
Add Sum
SignExt(imm16) x 4
16
Zero flag =1
if R[rs] - R[rt] = 0
(i.e R[rs] = R[rt])
Zero  R[rs] - R[rt]
32
Sig n
ex t e n d SignExt(imm16)
4th Edition Figure 4.9, page 312 - 3rd Edition Figure 5.9, page 297
Main ALU Evaluates
Branch Condition
(subtract)
EECC550 - Shaaban
#34 Lec # 4 Winter 2012 12-11-2012
More Detailed Datapath
For Branch Operations
Branch
Zero
Zero
Instruction Address
32
0
PC
Mux
Adder
PC Ext
Sign extend
shift left 2
busW
00
Adder
32
PC+4
5
R[rs]
busA
32
R[rt]
busB
32
Clk
1
Main ALU
(subtract)
Branch
Target
Zero  R[rs] - R[rt]
Clk
New Third ALU
(adder)
5
Rw Ra Rb
32 32-bit
Registers
PC
Branch Target ALU
Rt
Equal?
4
imm16
Rs
RegWr 5
New 2X1 32-bit
MUX to select next PC value
EECC550 - Shaaban
#35 Lec # 4 Winter 2012 12-11-2012
Combining The Datapaths For Memory
Instructions and R-Type Instructions
4
[25:21] rs
R[rs]
32
[20:16] rt
R[rt]
32
0
1
32
1
0
R[rt]
32
rt/rd
MUX
not shown
SignExt(imm16)
[15:0] imm16
32
Highlighted muliplexors and connections added to combine the datapaths
of memory and R-Type instructions into one datapath
This is book version ORI not supported
4th
Edition Figure 4.10 Page 314 -
3rd
Edition Figure 5.10 Page 299
EECC550 - Shaaban
#36 Lec # 4 Winter 2012 12-11-2012
Instruction Fetch Datapath Added to
ALU R-Type and Memory Instructions Datapath
PC+ 4
32
32
Combination of Figure 4.10 (p. 314) and Figure 4.6 (p. 309)
[3rd Edition Figure 5.10 (p. 299) and Figure 5.6 (p. 293)]
PC
Add
4
PC
Read
address
rs
rt
I nstruction
In st r uct ion
m e m or y
Read
register 1
Read
register 2
Write
data
Re g ist e r s
16
0
M
u
x
1
A LU ALU
result
Address
Read
data
1
M
u
x
0
Dat a
m e m or y
R[rt]
Sig n
ex t e n d
M emtoReg
Zero
32
RegWrite
rt/rd
MUX
not shown
ALU operation
M emWrite
ALUSrc
Read R[rt]
data 2
Write
register
4
Read R[rs]
data 1
32
Write
data
32
M emRead
32
This is book version ORI not supported, no zero extend of immediate needed
EECC550 - Shaaban
#37 Lec # 4 Winter 2012 12-11-2012
A Simple Datapath For The MIPS Architecture
Datapath of branches and a program counter multiplexor are added.
Resulting datapath can execute in a single cycle the basic MIPS instruction:
- load/store word
- ALU operations
- Branches
Zero
32
32
PC +4
Branch
32
32
32
Branch Target
4
rs
R[rs]
rt
R[rt]
0
1
rt/rd
MUX
not shown
1
32
0
32
32
This is book version ORI not supported, no zero extend of immediate needed
4th Edition Figure 4.11 page 315 - 3rd Edition Figure 5.11 page 300
EECC550 - Shaaban
#38 Lec # 4 Winter 2012 12-11-2012
Main ALU Control
3rd Edition
• The main ALU has four control lines (detailed design in Appendix B)
with the following functions:
th
ALU Control Lines
4 Edition
Appendix C
ALU Function
0000
AND
0001
OR
0010
add
0110
subtract
0111
1100
Set-on-less-than
NOR
Not Used
• For our current subset of MIPS instructions only the top five
functions will be used (thus only three control lines will be used)
• For R-type instruction the ALU function depends on both the opcode
and the 6-bit “funct” function field
• For other instructions the ALU function depends on the opcode only.
• A local ALU control unit can be designed to accept 2-bit ALUop
control lines (from main control unit) and the 6-bit function field and
generate the correct 4-bit ALU control lines.
Or 3 bits depending on number
functions actually used
EECC550 - Shaaban
#39 Lec # 4 Winter 2012 12-11-2012
Local ALU Decoding of “func” Field
func
op
6
Main
Control
6
ALUop
Subtract = 01
Add = 00
Instruction
Operation
LW
SW
Branch Equal
R-Type
R-Type
R-Type
R-Type
R-Type
Load word
Store word
branch equal
add
subtract
AND
OR
set on less than
4
Or
3 bits
Opcode
Instruction
Opcode
ALUctr
ALU
2
ALU
Control
(Local)
Desired
ALUOp Funct Field ALU Action
00
00
01
10
10
10
10
10
R-Type = 10
XXXXXX
XXXXXX
XXXXXX
100000
100010
100100
100101
101010
add
add
subtract
add
subtract
and
or
set on less than
ALU Control
Lines
0010
0010
0110
0010
0110
0000
0001
0111
EECC550 - Shaaban
#40 Lec # 4 Winter 2012 12-11-2012
Local ALU Control Unit
Add = 00
Subtract = 01
R-type
=10
Add
Subtract
Add
Subtract
AND
OR
Set-On-less-Than
{
Page 302
(2 lines From main control unit)
2
Function
Field
3 ALU Control Lines
4th line = 0
More details found in Appendix D in Book CD – (3rd Edition Appendix C)
EECC550 - Shaaban
#41 Lec # 4 Winter 2012 12-11-2012
Single Cycle MIPS Datapath
Necessary multiplexors and control lines
are identified here and local ALU control added:
32
PCSrc
32
PC +4
32
Add
Add
4
Shift
left 2
RegWrite
Instruction [25:21]
PC
Read
address
Read
register 1
rs
Instruction [20:16]
Instruction
[31:0]
Instruction
memory
rt
Read
register 2
0
Instruction [15:11]
1
rd
M
u
x
RegDst
Instruction [15:0]
imm16
Write
register
Write
data
16
ALUSrc
R[rt]
0
1
M
u
x
MemtoReg
Zero
ALU ALU
result
Registers
Sign
extend
1
MemWrite
R[rs]
Read
data 1
M
u
x
Branch
Target
32
Read
data 2
32
ALU
result
Branch
0
PC +4
32
Zero
Address
Read
data
32
R[rt]
1
M
u
x
0
Data
Write memory
data
ALUOp (2-bits)
00 = add
01 = subtract
10 = R-Type
32
ALU
control
Function Field
Instruction [5:0]
ALUOp
MemRead
32
This is book version ORI not supported, no zero extend of immediate needed
4th Edition Figure 4.15 page 320 - 3rd Edition Figure 5.15 page 305
EECC550 - Shaaban
#42 Lec # 4 Winter 2012 12-11-2012
Putting It All Together: A Single Cycle Datapath
PCSrc
Branch
Zero
PC+4
Zero
Function
Field
0
32
imm16
16
(Includes ORI
not in book version)
1
=
32
Data In
32
ExtOp ALUSrc
MemWr
MemtoReg
Clk
32
0
Mux
Clk
Extender
Clk
00 = add
01 = subtract
10 = R-Type
Main
ALU
ALU
1
busW
Mux
PC
Mux
Adder
Rs Rt
5
5
R[rs]
busA
Rw Ra Rb
32
32 32-bit
R[rt]
Registers
busB
0
32
ALU
Control
RegWr 5
Branch
Target
e.g Sign Extend
+ Shift Left 2
ALUop
(2-bits)
Imm16
Rd Rt
0
1
Adder
PC Ext
imm16
Rd
RegDst
00
4
Rt
Instruction<31:0>
<0:15>
Rs
<11:15>
Adr
<16:20>
<21:25>
Inst
Memory
WrEn Adr
1
Data
Memory
MemRd
EECC550 - Shaaban
#43 Lec # 4 Winter 2012 12-11-2012
Instruction<31:0>
Rd
<0:25>
Rs
<0:15>
Rt
<11:15>
Op Fun
<16:20>
Adr
<21:25>
<21:25>
Instruction
Memory
Imm16 Jump_target
Control Unit
Control Lines
RegDst
ALUSrc
MemtoReg
RegWrite
Mem
Read
Mem
Write
Branch
ALOp
(2-bits)
DATA PATH
EECC550 - Shaaban
#44 Lec # 4 Winter 2012 12-11-2012
The Effect of The Control Signals
Signal
Name
Effect when deasserted (=0)
Effect when asserted (=1)
RegDst
The register destination number for the
write register comes from the rt field
(instruction bits 20:16).
The register destination number for the
write register comes from the rd field
(instruction bits 15:11).
RegWrite
None
The register on the write register input
is written with the value on the Write
data input.
ALUSrc
The second main ALU operand comes
from the second register file output
(Read data 2) R[rt]
The second main ALU operand is the
sign-extended lower 16 bits on the
instruction (imm16)
Branch
The PC is replaced by the output of the
adder that computes PC + 4
(BEQ)
If Zero =1 The PC is replaced by the output
of the adder that computes the branch target.
MemRead
None
Data memory contents designated by the
address input are put on the Read data
output.
MemWrite
None
MemtoReg
The value fed to the register write data
input comes from the main ALU.
Data memory contents designated by the
address input are replaced by the value on the
Write data input.
The value fed to the register write data
input comes from data memory.
EECC550 - Shaaban
#45 Lec # 4 Winter 2012 12-11-2012
Control Line Settings
Instruction
R-Format
RegDst ALUSrc Memto- Reg
Mem
Reg
Write Read
Control
Lines
Mem Branch ALUOp1 ALUOp0
Write
1
0
0
1
0
0
0
1
0
lw
0
1
1
1
1
0
0
0
0
sw
X
1
X
0
0
1
0
0
0
beq
X
0
X
0
0
0
1
0
1
ALUOp (2-bits)
4th Edition Figure 4.18 page 323
3rd Edition Figure 5.18 page 308
00 = add
01 = subtract
10 = R-Type
EECC550 - Shaaban
#46 Lec # 4 Winter 2012 12-11-2012
The Truth Table For The Main Control
(Opcode)
Similar to Figure 4.22 Page 327 (3rd Edition Figure 5.22 Page 312)
EECC550 - Shaaban
#47 Lec # 4 Winter 2012 12-11-2012
PLA Implementation of the Main Control
Control
Lines
To Datapath
Figure D.2.5 in Appendix D
(3rd Edition Figure C.2.5 in Appendix C)
PLA = Programmable Logic Array - Appendix C
(3rd
Edition Appendix B)
EECC550 - Shaaban
#48 Lec # 4 Winter 2012 12-11-2012
Single Cycle MIPS Datapath Control Unit Added
32
32
32
PC +4
Add
32
PC +4
M
u
x
ALU Branch 1
Add
Target
result
PC +4
32
4
Control
Opcode
PC
Read
address
Instruction [25–21] rs
Read
register 1
Instruction [20–16] rt
Read
register 2
Instruction
[31–0]
Instruction
memory
0
Instruction [15–11]
rd
Instruction [15–0]
imm16
4th Edition Figure 4.21, page 326
3rd Edition Figure 5.17, page 307
1
M
u
x
Shift
left 2
RegDst
Branch
MemRead
MemtoReg
ALUOp
MemWrite
ALUSrc
RegWrite
Instruction [31–26]
Write
register
Write
data
16
0
Read
data 1
R[rs]
Zero
R[rt]
Read
data 2
Registers
Sign
extend
0
1
M
u
x
ALU ALU
result
Address
Read
data
1
M
u
0x
Data
Write memory
data
32
ALU
control
Function Field
Instruction [5–0]
In this book version, ORI is not supported—no zero extend of immediate needed.
ALUOp (2-bits)
00 = add
01 = subtract
10 = R-Type
32
EECC550 - Shaaban
#49 Lec # 4 Winter 2012 12-11-2012
Adding Support For Jump:
Micro-Operation Sequence For Jump: J
j jump_target
2
OP
Jump_target
6 bits
26 bits
[31:26]
[25:0]
Jump address
in words
Instruction Word  Mem[PC]
Fetch the instruction
PC  PC + 4
Increment PC
PC  PC(31-28),jump_target,00
Update PC with jump address
Common
Steps
PC(31-28)
Jump
Address
jump target
4 bits
4 highest bits from PC + 4
26 bits
0 0
2 bits
EECC550 - Shaaban
#50 Lec # 4 Winter 2012 12-11-2012
Datapath For Jump
Branch
Zero
Next Instruction Address
32
4
PCSrc
PC+4
32
00
Adder
e.g Sign Extend
+ Shift Left 2
PC+4(31-28)
jump_target
PC(31-28)
4 bits
PC
1
Branch
Target
Jump
Address
Clk
PC(31-28),jump_target,00
Instruction(25-0)
26
0
PC
4
32
Mux
Mux
Adder
imm16
PC Ext
Instruction(15-0)
JUMP
Shift left 2
jump target
26 bits
28
32
Jump Address
0 0
EECC550
2 bits
- Shaaban
#51 Lec # 4 Winter 2012 12-11-2012
Single Cycle MIPS Datapath Extended To Handle Jump with
Control Unit Added
32
Instruction [25–0]
32
Jump address [31–0]
Shift
left 2
26
28
PC + 4 [31–28]
4
Add
PC +4
32
PC +4
32
0
M
u
x
PC +4
Add
4
ALU
result
Branch
Target
1
1
32
M
u
x
0
Shift
left 2
RegDst
Jump
Branch
Opcode
MemRead
Instruction [31–26]
MemtoReg
Control
ALUOp
MemWrite
ALUSrc
RegWrite
Instruction [25–21]
PC
Read
address
Instruction [20–16]
Instruction
[31–0]
Instruction
memory
Instruction [15–11]
Instruction [15–0]
imm16
3rd
rt
Read
register 1
Read
data 1
Read
register 2
Edition Figure 4.24 page 329
Edition Figure 5.24 page 314
1
M
u
x
Read
data 2
Write
register
Write
data
16
R[rs]
Zero
0
rd
4th
rs
ALU
R[rt]
0
M
u
x
ALU
result
Data
memory
1
Registers
Sign
extend
Address
R[rt]
Write
data
Read
data
1
0
M
u
x
32
32
ALU
control
Function Field
Instruction [5–0]
In this book version, ORI is not supported—no zero extend of immediate needed.
ALUOp (2-bits)
00 = add
01 = subtract
10 = R-Type
EECC550 - Shaaban
#52 Lec # 4 Winter 2012 12-11-2012
Control Line Settings
(with jump instruction, j added)
Instruction
RegDst ALUSrc Memto- Reg
Mem
Reg
Write Read
Mem
Write
Branch ALUOp1 ALUOp0 Jump
1
0
0
1
0
0
0
1
0
0
lw
0
1
1
1
1
0
0
0
0
0
sw
X
1
X
0
0
1
0
0
0
0
beq
X
0
X
0
0
0
1
0
1
0
j
X
X
X
0
0
0
X
X
X
1
R-Format
Figure 4.18 page 323 (3rd Edition Figure 5.18 page 308) modified to include j
EECC550 - Shaaban
#53 Lec # 4 Winter 2012 12-11-2012
Worst Case Timing (Load)
Clk
PC Old Value
Clk-to-Q
New Value
Instruction Memoey Access Time
New Value
Rs, Rt, Rd,
Op, Func
Old Value
ALUctr
Old Value
ExtOp
Old Value
New Value
ALUSrc
Old Value
New Value
MemtoReg
Old Value
New Value
RegWr
Old Value
New Value
busA
busB
Delay through Control Logic
New Value
Register
Write Occurs
Register File Access Time
New Value
Old Value
Delay through Extender & Mux
Old Value
New Value
ALU Delay
Address
Old Value
New Value
Data Memory Access Time
busW
Old Value
New
EECC550 - Shaaban
#54 Lec # 4 Winter 2012 12-11-2012
Instruction Timing Comparison
Arithmetic & Logical
PC
Inst Memory
Reg File
mux
ALU
mux
setup
Load
PC
Inst Memory
Reg File mux
Critical Path
ALU
Data Mem
Store
PC
Inst Memory
Reg File
ALU
Data Mem
Branch
PC
Inst Memory
Reg File
Jump
PC
Inst Memory
mux
cmp
mux setup
mux
mux
EECC550 - Shaaban
#55 Lec # 4 Winter 2012 12-11-2012
Simplified Single Cycle Datapath Timing
•
Assuming the following datapath/control hardware components delays:
–
–
–
–
Memory Units: 2 ns
ALU and adders: 2 ns
Register File: 1 ns
Control Unit < 1 ns
}
Obtained from low-level target VLSI
implementation technology of components
• Ignoring Mux and clk-to-Q delays, critical path analysis:
1 ns
Control
Unit
2 ns
2 ns
Instruction
Memory
Main
ALU
Register
Read
Critical Path
PC + 4
ALU
2 ns
1 ns
Data
Memory
Register
Write
Critical
Path = 8 ns (LW)
(Load)
Branch Target
ALU
Time
2 ns
0
2ns
3ns
4ns
5ns
7ns
8ns
EECC550 - Shaaban
ns = nanosecond =
10-9
second
#56 Lec # 4 Winter 2012 12-11-2012
Performance of Single-Cycle (CPI=1) CPU
•
Assuming the following datapath hardware components delays:
– Memory Units: 2 ns
– ALU and adders: 2 ns
– Register File: 1 ns
•
Nanosecond, ns = 10-9 second
The delays needed for each instruction type can be found :
Instruction
Class
Instruction
Memory
Register
Read
ALU
Operation
Data
Memory
ALU
2 ns
1 ns
2 ns
Load
2 ns
1 ns
2 ns
2 ns
Store
2 ns
1 ns
2 ns
2 ns
Branch
2 ns
1 ns
2 ns
Jump
2 ns
Register
Write
Total
Delay
1 ns
6 ns
1 ns
8 ns
Load has longest
delay of 8 ns
thus determining
the clock cycle of
the CPU to be 8ns
7 ns
5 ns
2 ns
C = 8 ns
•
The clock cycle is determined by the instruction with longest delay: The load
in this case which is 8 ns. Clock rate = 1 / 8 ns = 125 MHz
• A program with I = 1,000,000 instructions executed takes:
Execution Time = T = I x CPI x C = 106 x 1 x 8x10-9 = 0.008 s = 8 msec
T = I x CPI x C
EECC550 - Shaaban
#57 Lec # 4 Winter 2012 12-11-2012
Adding Support for jal to Single Cycle Datapath
• The MIPS jump and link instruction, jal is used to support
procedure calls by jumping to jump address (similar to j ) and
saving the address of the following instruction PC+4 in register
$ra ($31)
i.e. Return Address
R[31]  PC + 4
jal Address
PC  Jump Address
• jal uses the j instruction format:
op (6 bits)
Target address (26 bits)
• We wish to add jal to the single cycle datapath in Figure 4.24
page 329 (3rd Edition Figure 5.24 page 314) . Add any necessary
datapaths and control signals to the single-clock datapath and
justify the need for the modifications, if any.
• Specify control line values for this instruction.
EECC550 - Shaaban
#58 Lec # 4 Winter 2012 12-11-2012
jump and link, jal support to Single Cycle Datapath
Instruction Word  Mem[PC]
R[31]  PC + 4
PC  Jump Address
Jump Address
PC + 4
PC + 4
Branch Target
PC + 4
rs
R[rs]
rt
R[rt]
31
2
2
rd
imm16
1. Expand the multiplexor controlled by RegDst to include the value 31 as a new input 2.
2. Expand the multiplexor controlled by MemtoReg to have PC+4 as new input 2.
EECC550 - Shaaban
#59 Lec # 4 Winter 2012 12-11-2012
jump and link, jal support to Single Cycle Datapath
Adding Control Lines Settings for jal
(For Textbook Single Cycle Datapath including Jump)
RegDst
Is now 2 bits
RegDst
MemtoReg
Is now 2 bits
ALUSrc
MemtoReg
Reg
Write
Mem Mem
Read Write Branch ALUOp1 ALUOp0 Jump
R-format
01
0
00
1
0
0
0
1
0
0
lw
00
1
01
1
1
0
0
0
0
0
sw
xx
1
xx
0
0
1
0
0
0
0
beq
xx
0
xx
0
0
0
1
0
1
0
J
xx
x
xx
0
0
0
x
x
x
1
JAL
10
x
10
1
0
0
x
x
x
1
R[31]
PC+ 4
PC  Jump Address
Instruction Word  Mem[PC]
R[31]  PC + 4
PC  Jump Address
EECC550 - Shaaban
#60 Lec # 4 Winter 2012 12-11-2012
Load Word Register
Adding Support for LWR to Single Cycle Datapath
• We wish to add a variant of lw (load word) let’s call it LWR to
the single cycle datapath in Figure 4.24 page 329 (3rd Edition
Figure 5.24 page 314).
LWR $rd, $rs, $rt
• The LWR instruction is similar to lw but it sums two registers
(specified by $rs, $rt) to obtain the effective load address and
Loaded word from memory written to register rd
uses the R-Type format
• Add any necessary datapaths and control signals to the single
cycle datapath and justify the need for the modifications, if any.
• Specify control line values for this instruction.
EECC550 - Shaaban
#61 Lec # 4 Winter 2012 12-11-2012
LWR (R-format LW) support to Single Cycle Datapath
Instruction Word  Mem[PC]
PC  PC + 4
R[rd]  Mem[ R[rs] + R[rt] ]
No new components or connections are needed for the datapath
just the proper control line settings
Adding Control Lines Settings for LWR
(For Textbook Single Cycle Datapath including Jump)
MemtoReg
Reg
Write
Mem Mem
Read Write Branch ALUOp1 ALUOp0 Jump
RegDst
ALUSrc
R-format
1
0
0
1
0
0
0
1
0
0
lw
0
1
1
1
1
0
0
0
0
0
sw
x
1
x
0
0
1
0
0
0
0
beq
x
0
x
0
0
0
1
0
1
0
J
x
x
x
0
0
0
x
x
x
1
LWR
1
0
1
1
1
0
0
0
0
0
Add
rd
R[rt]
EECC550 - Shaaban
#62 Lec # 4 Winter 2012 12-11-2012
Jump Memory
Adding Support for jm to Single Cycle Datapath
• We wish to add a new instruction jm (jump memory) to the single
cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page
314).
jm offset($rs)
• The jm instruction loads a word from effective address (R[rs] + offset),
this is similar to lw except the loaded word is put in the PC instead of
register $rt.
• Jm used the I-format with field rt not used.
OP
rs
rt
6 bits
5 bits
5 bits
address (imm16)
Not Used
16 bits
• Add any necessary datapaths and control signals to the single cycle
datapath and justify the need for the modifications, if any.
• Specify control line values for this instruction.
EECC550 - Shaaban
#63 Lec # 4 Winter 2012 12-11-2012
Adding jump memory, jm support to Single Cycle Datapath
Instruction Word  Mem[PC]
PC  Mem[R[rs] + SignExt[imm16]]
1. Expand the multiplexor controlled by Jump to include the Read Data (data memory output)
as new input 2. The Jump control signal is now 2 bits
Jump 2
2
Jump
PC + 4
2
Branch Target
rs
R[rs]
rt
R[rt]
rd
imm16
EECC550 - Shaaban
#64 Lec # 4 Winter 2012 12-11-2012
Adding jm support to Single Cycle Datapath
Adding Control Lines Settings for jm
(For Textbook Single Cycle Datapath including Jump)
Jump
is now 2 bits
RegDst
ALUSrc
MemtoReg
Reg
Write
Mem Mem
Read Write Branch ALUOp1 ALUOp0 Jump
R-format
1
0
0
1
0
0
0
1
0
00
lw
0
1
1
1
1
0
0
0
0
00
sw
x
1
x
0
0
1
0
0
0
00
beq
x
0
x
0
0
0
1
0
1
00
J
x
x
x
0
0
0
x
x
x
01
Jm
x
1
x
0
1
0
x
0
0
10
add
PC Mem[R[rs] + SignExt[imm16]]
EECC550 - Shaaban
#65 Lec # 4 Winter 2012 12-11-2012
Drawbacks of Single Cycle Processor
1. Long cycle time:
– All instructions must take as much time as the slowest
•
Here, cycle time for load is longer than needed for all other instructions.
–
Cycle time must be long enough for the load instruction:
PC’s Clock -to-Q + Instruction Memory Access Time +
Register File Access Time + ALU Delay (address calculation) +
Data Memory Access Time + Register File Setup Time + Clock Skew
– Real memory is not as well-behaved as idealized memory
•
Cannot always complete data access in one (short) cycle.
2. Impossible to implement complex, variable-length instructions and
complex addressing modes in a single cycle.
– e.g indirect memory addressing. e.g R[$1]  Mem[ Mem[$2] ]
3. High and duplicate hardware resource requirements
– Any hardware functional unit cannot be used more than once in
a single cycle (e.g. ALUs).
4. Does not allow overlap of instruction processing (instruction
pipelining, chapter 6).
EECC550 - Shaaban
#66 Lec # 4 Winter 2012 12-11-2012
Download