6.004 Computation Structures

MIT OpenCourseWare
http://ocw.mit.edu
6.004 Computation Structures
Spring 2009
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Building the Beta
CPU Design Tradeoffs
Maximum Performance: measured by the number of instructions executed per second.
Minimum Cost: measured by the size of the circuit.
Best Performance/Price: measured by the ratio of MIPS to size. In power-sensitive applications MIPS/Watt is important too.
Figure by MIT OpenCourseWare.
Lab #5 due Thursday
3/31/09
6.004 – Spring 2009
L14 – Building a Beta 1
Performance Measure
MIPS = Clock Frequency (MHz) / C.P.I.
• MIPS: Millions of Instructions per Second
• C.P.I.: Clocks per instruction
PUSHING PERFORMANCE ...
TODAY: 1 cycle/inst.
LATER: more MHz via pipelining
The Beta ISA
Instruction classes distinguished by OPCODE: OP, OPC, MEM, Transfer of Control.
OP format (field widths 6, 5, 5, 5, 11 bits):
10xxxx | Rc | Ra | Rb | (UNUSED)
Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]
OPC format (field widths 6, 5, 5, 16 bits):
11xxxx | Rc | Ra | Literal C (signed)
Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)
Opcodes, both formats:
ADD SUB MUL* DIV* (*optional)
CMPEQ CMPLE CMPLT
AND OR XOR
SHL SHR SRA
MEM and Transfer of Control format (field widths 6, 5, 5, 16 bits):
01xxxx | Rc | Ra | Literal C (signed)
LD: Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
ST: Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]
BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
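The address arithmetic in the MEM and Transfer of Control definitions can be sketched in a few lines of Python. This is only an illustration: the function names (sxt16, mem_address, branch_target) are made up for this sketch, and wrapping results to 32 bits is an assumption about the machine word size.

```python
MASK32 = 0xFFFFFFFF  # assumption: addresses wrap to 32 bits

def sxt16(c: int) -> int:
    """SXT(C): interpret the low 16 bits of c as a signed value."""
    c &= 0xFFFF
    return c - 0x10000 if c & 0x8000 else c

def mem_address(reg_ra: int, c: int) -> int:
    """Effective address used by LD and ST: Reg[Ra] + SXT(C)."""
    return (reg_ra + sxt16(c)) & MASK32

def branch_target(pc: int, c: int) -> int:
    """Address used by BEQ, BNE and LDR: PC + 4 + 4*SXT(C)."""
    return (pc + 4 + 4 * sxt16(c)) & MASK32
```

Note that the literal counts words relative to PC+4, so a literal of 0xFFFF (that is, -1) in a branch at address 0x100 points back at the branch itself.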
Approach: Incremental Featurism
Each instruction class can be implemented using a simple component repertoire. We’ll try implementing data paths for each class individually, and merge them (using MUXes, etc).
Steps:
1. Operate instructions
2. Load & Store Instructions
3. Jump & Branch instructions
4. Exceptions
5. Merge data paths
Our Bag of Components:
• Registers
• Muxes
• “Black box” ALU
• Memories: Instruction Memory, Data Memory
Multi-Port Register Files
Register File (3-port): 2 combinational READ ports*, 1 clocked WRITE port
*internal logic ensures Reg[31] reads as 0
• Read Port A: RA1 (5-bit read address), RD1 (32-bit read data)
• Read Port B: RA2 (5-bit read address), RD2 (32-bit read data)
• Write Port: WA (5-bit Write Address), WD (32-bit Write Data), WE (Write Enable), clk
The two read ports have independent read addresses and independent read data.
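A behavioral sketch of the 3-port register file may help fix the port roles. The class and method names below are hypothetical; reads are modeled as plain function calls (combinational) and the write happens only when clock() is called (the clock edge).

```python
class RegisterFile:
    """Hypothetical model: 2 combinational read ports, 1 clocked write port."""

    def __init__(self):
        self.regs = [0] * 32  # 32 registers of 32 bits each

    def read(self, ra: int) -> int:
        # Either read port: Reg[31] always reads as 0.
        return 0 if ra == 31 else self.regs[ra]

    def clock(self, wa: int, wd: int, we: bool) -> None:
        # Write port: the register changes only on a clock edge with WE set.
        if we:
            self.regs[wa] = wd & 0xFFFFFFFF
```

Because the write takes effect only in clock(), a read of the register being written still returns the old value until clock() has run.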
Register File Timing
2 combinational READ ports, 1 clocked WRITE port
[Timing diagram: read data RD settles to Reg[A] within tPD of a read-address change, and to the new Reg[A] within tPD of the clock edge that writes it. WE, WA and WD must meet setup (tS) and hold (th) times around the CLK edge.]
What if (say) WA=RA1???
RD1 reads “old” value of Reg[RA1] until next clock edge!
Starting point: ALU Ops
32-bit (4-byte) ADD instruction:
100000 00100 00010 00011 00000000000
OpCode Rc    Ra    Rb    (unused)
Means, to BETA, Reg[R4] ← Reg[R2] + Reg[R3]
First, we’ll need hardware to:
• Read next 32-bit instruction
• DECODE instruction: ADD, SUB, XOR, etc
• READ operands (Ra, Rb) from Register File;
• PERFORM indicated operation;
• WRITE result back into Register File (Rc).
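The DECODE step for the ADD word above can be sketched as plain bit-field extraction. The function name and the dictionary keys are illustrative; the field positions follow the 6/5/5/5/11 layout of the operate format.

```python
def decode(word: int) -> dict:
    """Split a 32-bit Beta instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits <31:26>
        "rc": (word >> 21) & 0x1F,       # bits <25:21>
        "ra": (word >> 16) & 0x1F,       # bits <20:16>
        "rb": (word >> 11) & 0x1F,       # bits <15:11>
        "literal": word & 0xFFFF,        # bits <15:0>, used by the literal formats
    }

# The ADD example from the slide, written as a binary literal.
ADD_WORD = 0b10000000100000100001100000000000
fields = decode(ADD_WORD)
```

For this word the decoder yields opcode 100000 with Rc=4, Ra=2 and Rb=3, matching Reg[R4] ← Reg[R2] + Reg[R3].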
Instruction Fetch/Decode
• Use a counter to FETCH the next instruction: PROGRAM COUNTER (PC)
  • use PC as memory address
  • add 4 to PC, load new value at end of cycle
  • fetch instruction from memory
    º use some instruction fields directly (register numbers, 16-bit constant)
    º use bits <31:26> to generate controls
[Figure: the PC (low two bits 00) addresses the Instruction Memory and feeds a +4 adder whose output is loaded back into the PC. The 32-bit INSTRUCTION WORD is split into FIELDS; OPCODE <31:26> drives the Control Logic, which generates the CONTROL SIGNALS.]
ALU Op Data Path
10xxxx | Rc | Ra | Rb | (UNUSED)
Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]
[Figure: ALU Op data path. Ra <20:16> drives RA1, Rb <15:11> drives RA2 and Rc <25:21> drives WA of the Register File. RD1 and RD2 feed ALU inputs A and B, and the ALU output returns to WD.]
Control signals: ALUFN, WERF!
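One clock cycle of the ALU Op data path can be sketched as a single function. This is a simplified model, not the lab implementation: the name step, the list-based instruction memory and the ALU_FUNCS table are assumptions, and only ADD (opcode 100000, taken from the lecture's example word) is filled in.

```python
MASK32 = 0xFFFFFFFF

# Opcode -> ALU function. Only ADD is listed; other encodings are not given here.
ALU_FUNCS = {0b100000: lambda a, b: (a + b) & MASK32}

def step(pc: int, regs: list, imem: list) -> int:
    """Execute one operate-class instruction and return the next PC."""
    word = imem[pc >> 2]                       # fetch: PC is a byte address
    opcode = (word >> 26) & 0x3F               # decode: bits <31:26>
    rc = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    a = 0 if ra == 31 else regs[ra]            # read port A
    b = 0 if rb == 31 else regs[rb]            # read port B
    result = ALU_FUNCS[opcode](a, b)           # ALUFN chosen from the opcode
    if rc != 31:
        regs[rc] = result                      # WERF = 1: write back to Reg[Rc]
    return pc + 4                              # PC <- PC + 4 at end of cycle
```

A quick usage example, with register values chosen arbitrarily:

```python
regs = [0] * 32
regs[2], regs[3] = 5, 7
next_pc = step(0, regs, [0x80821800])   # the ADD(R2, R3, R4) word
```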
ALU Operations (w/constant)
11xxxx | Rc | Ra | Literal C (signed)
Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)
[Figure: ALU Op data path plus a BSEL mux on ALU input B: input 0 is RD2, input 1 is C: SXT(<15:0>).]
Control signals: BSEL, ALUFN, WERF
Load Instruction
011000 | Rc | Ra | Literal C (signed)
LD: Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
[Figure: the ALU output drives the Data Memory address (Adr). A WDSEL mux selects the Register File write data, with the ALU output on input 1 and the memory read data (RD) on input 2.]
Control signals: BSEL, WDSEL, ALUFN, Wr, WERF
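The two muxes introduced for literal operands and loads can be modeled as small selection functions. The sketch below uses hypothetical names and a dictionary keyed by byte address as a stand-in for the Data Memory; the mux encodings (BSEL=1 for the literal, WDSEL=1 for the ALU, WDSEL=2 for memory data) follow the lecture's control settings.

```python
MASK32 = 0xFFFFFFFF

def sxt16(c: int) -> int:
    """Sign-extend a 16-bit literal."""
    c &= 0xFFFF
    return c - 0x10000 if c & 0x8000 else c

def alu_b_input(rd2: int, literal: int, bsel: int) -> int:
    """BSEL mux: 0 selects RD2, 1 selects SXT(C)."""
    return sxt16(literal) if bsel else rd2

def write_data(pc_plus_4: int, alu_out: int, mem_rd: int, wdsel: int) -> int:
    """WDSEL mux: 0 selects PC+4, 1 the ALU output, 2 the memory read data."""
    return (pc_plus_4, alu_out, mem_rd)[wdsel]

def load(regs: list, dmem: dict, rc: int, ra: int, literal: int) -> None:
    """LD: Reg[Rc] <- Mem[Reg[Ra] + SXT(C)], with BSEL=1, ALUFN='+', WDSEL=2."""
    addr = (regs[ra] + alu_b_input(0, literal, bsel=1)) & MASK32
    regs[rc] = write_data(0, addr, dmem[addr], wdsel=2)
```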
Store Instruction
011001 | Rc | Ra | Literal C (signed)
ST: Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
[Figure: an RA2SEL mux selects the RA2 address: input 0 is Rb <15:11>, input 1 is Rc <25:21>. RD2 drives the Data Memory write data (WD) and the ALU output drives Adr.]
Control signals: RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF
No WERF!
JMP Instruction
011011 | Rc | Ra | Literal C (signed)
JMP: Reg[Rc] ← PC+4; PC ← Reg[Ra]
[Figure: a PCSEL mux (inputs 0 to 4) selects the next PC: input 0 is PC+4 and input 2 is JT, the jump target read from RD1. PC+4 also feeds WDSEL input 0 so it can be written to Reg[Rc].]
Control signals: PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF
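The JMP register transfer can be sketched as follows. The function name is hypothetical, and forcing the low two bits of the new PC to zero is an assumption based on the "00" drawn on the PC in the data path figures.

```python
MASK32 = 0xFFFFFFFF

def jmp(regs: list, pc: int, rc: int, ra: int) -> int:
    """JMP: Reg[Rc] <- PC+4; PC <- Reg[Ra]. Returns the new PC."""
    jt = 0 if ra == 31 else regs[ra]     # JT comes from read port 1 (old value)
    if rc != 31:
        regs[rc] = (pc + 4) & MASK32     # WDSEL = 0 routes PC+4 to the write port
    return jt & 0xFFFFFFFC               # assumption: PC stays word-aligned
```

Reading JT before the write matters when Rc and Ra name the same register: the jump uses the old register value, as the register file timing requires.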
BEQ/BNE Instructions
011101 | Rc | Ra | Literal C (signed)
BEQ: Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
011110 | Rc | Ra | Literal C (signed)
BNE: Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
[Figure: an adder forms PC+4+4*SXT(C) from PC+4 and 4*SXT(<15:0>) and feeds PCSEL input 1. Z, which signals that RD1 is zero, goes to the Control Logic.]
Control signals: PCSEL, RA2SEL, BSEL, WDSEL, ALUFN, Wr, WERF
Load Relative Instruction
011111 | Rc | Ra | Literal C (signed)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
Hey, WAIT A MINUTE. What’s Load Relative good for anyway??? I thought
• Code is “PURE”, i.e. READ-ONLY; and stored in a “PROGRAM” region of memory;
• Data is READ-WRITE, and stored either
  • On the STACK (local); or
  • In some GLOBAL VARIABLE region; or
  • In a global storage HEAP.
So why an instruction designed to load data that’s “near” the instruction???
Addresses & other large constants
C:
    X = X * 123456;
BETA:
    LD(X, r0)
    LDR(c1, r1)
    MUL(r0, r1, r0)
    ST(r0, X)
    ...
c1: LONG(123456)
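To make the LDR addressing concrete, the sketch below computes the 16-bit literal an assembler would need for LDR(c1, r1) and then recomputes the address the hardware forms. The function names and the example addresses are hypothetical; only the formula PC + 4 + 4*SXT(C) comes from the instruction definition.

```python
MASK32 = 0xFFFFFFFF

def sxt16(c: int) -> int:
    """Sign-extend a 16-bit literal."""
    c &= 0xFFFF
    return c - 0x10000 if c & 0x8000 else c

def ldr_literal(instr_addr: int, label_addr: int) -> int:
    """Literal C such that instr_addr + 4 + 4*SXT(C) == label_addr."""
    offset = label_addr - (instr_addr + 4)
    if offset % 4 != 0:
        raise ValueError("label must be word-aligned")
    c = offset // 4
    if not -0x8000 <= c <= 0x7FFF:
        raise ValueError("label is out of range for a 16-bit literal")
    return c & 0xFFFF

def ldr_address(pc: int, literal: int) -> int:
    """Address read by LDR: PC + 4 + 4*SXT(C)."""
    return (pc + 4 + 4 * sxt16(literal)) & MASK32
```

The range check shows why the constant has to sit "near" the instruction: a 16-bit signed word offset reaches only about 32K words in either direction.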
LDR Instruction
011111 | Rc | Ra | Literal C (signed)
LDR: Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
[Figure: an ASEL mux on ALU input A selects RD1 (input 0) or PC+4+4*SXT(C) (input 1), so the output of the branch-offset adder can pass through the ALU to the Data Memory address.]
Control signals: PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF
Exceptions
What if something BAD happens?
• Execution of an illegal op-code
• Reference to non-existent memory
• Divide by zero
Or, maybe, just something unanticipated…
• User hits a key
• A packet comes in via the network
GOAL: handle all these cases (and more) in SOFTWARE:
• Treat each such case as an (implicit) procedure call…
• Procedure handles problem, returns to interrupted program.
• TRANSPARENT to interrupted program!
• Important added capability: handlers for certain errors (illegal opcodes) can extend instruction set using software (Lab 7!).
Exception Processing
Plan:
• Interrupt running program
• Invoke exception handler (like a procedure call)
• Return to continue execution.
We’d like RECOVERABLE INTERRUPTS for
• Synchronous events, generated by CPU or system
  FAULTS (eg, Illegal Instruction, divide-by-0, illegal mem address)
  TRAPS & system calls (eg, read-a-character)
• Asynchronous events, generated by I/O
  (eg, key struck, packet received, disk transfer complete)
KEY: TRANSPARENCY to interrupted program.
• Most difficult for asynchronous interrupts
Implementation…
How exceptions work:
• Don’t execute current instruction
• Instead fake a “forced” procedure call
  • save current PC (actually current PC + 4)
  • load PC with exception vector
    • 0x4 for synch. exception, 0x8 for asynch. exceptions
Question: where to save current PC + 4?
• Our approach: reserve a register (R30, aka XP)
• Prohibit user programs from using XP. Why?
Example: DIV unimplemented
    LD(R31,A,R0)
    LD(R31,B,R1)
    DIV(R0,R1,R2)
    ST(R2,C,R31)
Forced by hardware: the unimplemented DIV transfers control to IllOp.
IllOp:
    PUSH(XP)
    Fetch inst. at Mem[Reg[XP]–4]
    check for DIV opcode, get reg numbers
    perform operation in SW, fill result reg
    POP(XP)
    JMP(XP)
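The DIV example can be walked through in a small simulation. Everything here is a sketch under stated assumptions: the DIV opcode value 100011 is assumed (the lecture does not give it), the function names are made up, memory is a word-indexed list, operands are treated as non-negative integers, and the PUSH(XP)/POP(XP) pair is omitted because this handler takes no nested exceptions.

```python
XP = 30                 # exception pointer register, from the lecture
ILLOP_VECTOR = 0x4      # vector for synchronous exceptions, from the lecture
DIV_OPCODE = 0b100011   # assumption: encoding not given in the lecture

def take_illop(regs: list, pc: int) -> int:
    """Hardware side: skip the instruction, save PC+4 in XP, jump to the vector."""
    regs[XP] = pc + 4
    return ILLOP_VECTOR

def illop_handler(regs: list, imem: list) -> int:
    """Software side: emulate DIV, then return the resume address (JMP(XP))."""
    word = imem[(regs[XP] - 4) >> 2]          # fetch inst. at Mem[Reg[XP]-4]
    if (word >> 26) & 0x3F != DIV_OPCODE:     # check for DIV opcode
        raise RuntimeError("illegal instruction")
    rc = (word >> 21) & 0x1F                  # get reg numbers
    ra = (word >> 16) & 0x1F
    rb = (word >> 11) & 0x1F
    regs[rc] = regs[ra] // regs[rb]           # perform operation in SW
    return regs[XP]                           # resume after the DIV
```

Because the handler reads and writes only the registers named by the faulting instruction (plus XP, which user code may not use), the interrupted program cannot tell that DIV was done in software.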
Exceptions
Bad Opcode: Reg[XP] ← PC+4; PC ← “IllOp”
Other: Reg[XP] ← PC+4; PC ← “Xadr”
[Figure: exception data path. PCSEL gains input 3 (ILLOP) and input 4 (XAdr). A WASEL mux selects the write address: input 0 is Rc <25:21>, input 1 is XP. IRQ is an input to the Control Logic.]
Control signals: PCSEL, RA2SEL, ASEL, BSEL, WDSEL, ALUFN, Wr, WERF, WASEL
Beta: Our “Final Answer”
[Figure: complete Beta data path, combining the PCSEL, RA2SEL, WASEL, ASEL, BSEL and WDSEL muxes with the Instruction Memory, Register File, ALU and Data Memory.]
Control Logic
          OP     OPC    LD   ST   JMP  BEQ    BNE    LDR  ILLOP  IRQ
ALUFN     F(op)  F(op)  "+"  "+"  --   --     --     "A"  --     --
WERF      1      1      1    0    1    1      1      1    1      1
BSEL      0      1      1    1    --   --     --     --   --     --
WDSEL     1      1      2    --   0    0      0      2    0      0
WR        0      0      0    1    0    0      0      0    0      0
RA2SEL    0      --     --   1    --   --     --     --   --     --
PCSEL     0      0      0    0    2    Z?1:0  Z?0:1  0    3      4
ASEL      0      0      0    0    --   --     --     1    --     --
WASEL     0      0      0    --   0    0      0      0    1      1
Implementation choices:
• ROM indexed by opcode, external branch & trap logic
• PLA
• “random” logic (eg, standard cell gates)
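The first implementation choice, a ROM indexed by opcode with external branch logic, can be sketched as a lookup table plus a small fix-up for the branches. This is illustrative only: the table is keyed by instruction name rather than by the 6-bit opcode, None stands for a don't-care entry, and just seven instruction classes are listed.

```python
# Control "ROM": one row of control signals per instruction class.
CONTROL_ROM = {
    "OP":  dict(ALUFN="F(op)", WERF=1, BSEL=0, WDSEL=1, WR=0, RA2SEL=0, PCSEL=0, ASEL=0, WASEL=0),
    "OPC": dict(ALUFN="F(op)", WERF=1, BSEL=1, WDSEL=1, WR=0, RA2SEL=None, PCSEL=0, ASEL=0, WASEL=0),
    "LD":  dict(ALUFN="+", WERF=1, BSEL=1, WDSEL=2, WR=0, RA2SEL=None, PCSEL=0, ASEL=0, WASEL=0),
    "ST":  dict(ALUFN="+", WERF=0, BSEL=1, WDSEL=None, WR=1, RA2SEL=1, PCSEL=0, ASEL=0, WASEL=None),
    "JMP": dict(ALUFN=None, WERF=1, BSEL=None, WDSEL=0, WR=0, RA2SEL=None, PCSEL=2, ASEL=None, WASEL=0),
    "BEQ": dict(ALUFN=None, WERF=1, BSEL=None, WDSEL=0, WR=0, RA2SEL=None, PCSEL=None, ASEL=None, WASEL=0),
    "BNE": dict(ALUFN=None, WERF=1, BSEL=None, WDSEL=0, WR=0, RA2SEL=None, PCSEL=None, ASEL=None, WASEL=0),
}

def control_signals(instr: str, z: int) -> dict:
    """Look up the ROM row, then let the external branch logic set PCSEL from Z."""
    signals = dict(CONTROL_ROM[instr])        # copy so the ROM stays read-only
    if instr == "BEQ":
        signals["PCSEL"] = 1 if z else 0      # branch taken when Reg[Ra] is zero
    elif instr == "BNE":
        signals["PCSEL"] = 0 if z else 1      # branch taken when Reg[Ra] is nonzero
    return signals
```

Keeping Z out of the ROM address is the point of the "external branch logic": the ROM needs only one row per opcode instead of one per opcode and Z combination.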
Next Time: Tackling the Memory Bottleneck
“Is that all there is to building a processor???”
“No. You’ve gotta print up all those little “Beta Inside” stickers.”