Building the Beta Beta Kit ALU

advertisement
Building the Beta
A
ALU
n
a Ki strumctoiory
IntMe
0
Bet
B
A
1
D
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 1
CPU Design Tradeoffs
Maximum Performance: measured by the numbers of
instructions executed per second
Minimum Cost : measured by the size of the circuit.
Best Performance/Price: measured by the ratio of MIPS to
size. In power-sensitive applications MIPS/Watt is
important too.
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 2
Performance Measure
Millions of Instructions per Second
MIPS =
Clock Frequency (MHz)
C.P.I.
Clocks per instruction
PUSHING PERFORMANCE ...
TODAY: 1 cycle/inst.
NEXT TIME: more MHz via pipelining
NEXT NEXT TIME: fixing various pipelining issues
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 3
The Beta ISA
OPCODE
6
5
5
5
11
16
10xxxx
Rc
Ra
Rb
(unused)
Operate class: Reg[Rc] ← Reg[Ra] op Reg[Rb]
11xxxx
Rc
Ra
literal C (signed)
Operate class: Reg[Rc] ← Reg[Ra] op SXT(C)
Opcodes, both formats:
ADD
SUB
CMPEQ CMPLE
AND
OR
SHL
SHR
01xxxx
Rc
LD:
ST:
JMP:
BEQ:
BNE:
LDR:
6.004 – Fall 2002
Ra
MUL*
CMPLT
XOR
SRA
DIV*
*optional
literal C (signed)
Instruction
Instructionclasses
classes
distinguished
distinguishedby
by
OPCODE:
OPCODE:
OP
OP
OPC
OPC
MEM
MEM
Transfer
TransferofofControl
Control
Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
Reg[Rc] ← PC+4; PC ← Reg[Ra]
Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
10/29/0
2
L15 – Building a Beta 4
Approach: Incremental Featurism
Each instruction class can be implemented using a simple component
repertoire. We’ll try implementing data paths for each class individually,
and merge them (using MUXes, etc).
Steps:
1. Operate instructions
2. Load & Store Instructions
3. Jump & Branch instructions
4. Exceptions
5. Merge data paths
Our Bag of Components:
Registers
1
0
A
ALU
RA1
WA
WE
WD
Muxes
B
“Black box” ALU
RA2
Register
File
(3-port)
A
D
WD
Instruction
Memory
A
RD
Data
Memory
R/W
RD1
6.004 – Fall 2002
10/29/0
2
RD2
Memories
L15 – Building a Beta 5
Multi-Port Register Files
Write
Port
(independent Read addresses)
dest
Write Address
EN
EN
clk
EN
…
EN
5
CLK
bsel
asel
5
RA1
RA2
WA
WE
Write Enable
Write Data
5
32
WD
Register
File
(3-port)
RD1
32
Read
Port A
RD2
32
(Independent Read Data)
Read
Port B
D
EN
s1 0
clk
D
Q
2 combinational READ ports*,
1 clocked WRITE port
*internal logic ensures Reg[31] reads as 0
Q
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 6
Register File Timing
2 combinational READ ports, 1 clocked WRITE port
RA
A
RD
Reg[A]
new Reg[A]
tPD
tPD
CLK
WE
A
WA
WD
new Reg[A]
tS th
What if (say) WA=RA1???
RD1 reads “old” value of Reg[RA1] until next clock edge!
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 7
Starting point: ALU Ops
32-bit (4-byte) ADD instruction:
10000000100000100001100000000000
OpCode
Rc
Ra
Rb
(unused)
Means, to BETA, Reg[R4] ← Reg[R2] + Reg[R3]
First, we’ll need hardware to:
• Read next 32-bit instruction
• DECODE instruction: ADD, SUB, XOR, etc
• READ operands (Ra, Rb) from Register File;
• PERFORM indicated operation;
• WRITE result back into Register File (Rc).
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 8
Instruction Fetch/Decode
• Use a counter to FETCH the next instruction:
PROGRAM COUNTER (PC)
PC
00
A Instruction
32
+4
Memory
D
32
32
INSTRUCTION
WORD
FIELDS
OPCODE <31:26>
Control Logic
CONTROL SIGNALS
6.004 – Fall 2002
10/29/0
2
• use PC as memory address
• add 4 to PC, load new value
at end of cycle
• fetch instruction from
memory
º use some instruction
fields directly (register
numbers, 16-bit
constant)
º use bits <31:26> to
generate controls
L15 – Building a Beta 9
ALU Op Data Path
PC
1 0 XXXX
00
A
Instruction
Memory
Rc
Ra
Rb
unused
Reg[Rc] ← Reg[Ra] op Reg[Rb]
OP:
D
+4
Ra: <20:16>
Rc: <25:21>
Rb: <15:11>
RA1
RA2
Register
File
WA
RD1
WD
RD2
32
WE
WERF
32
Control Logic
A
ALUFN
ALU
B
ALUFN
WERF
32
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 10
ALU Operations (w/constant)
1 1 XXXX
PC
OPC:
00
A
Rc
Ra
literal C (signed)
Reg[Rc] ← Reg[Ra] op SXT(C)
Instruction
Memory
D
+4
Ra: <20:16>
RA1
Rc: <25:21>
WA
WA
RD1
Rb: <15:11>
Register
File
RA2
WD
RD2
WE
WERF
C: SXT(<15:0>)
32
1
0
BSEL
Control Logic
A
BSEL
ALUFN
ALU
B
ALUFN
WERF
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 11
Load Instruction
011000
PC
LD:
00
A
+4
Rc
Ra
literal C (signed)
Reg[Rc] ← Mem[Reg[Ra]+SXT(C)]
Instruction
Memory
D
Ra: <20:16>
Rb: <15:11>
Register
File
RA1
Rc: <25:21>
WA
WA
RD1
RA2
WD
RD2
WE
WERF
C: SXT(<15:0>)
1
0
BSEL
Control Logic
A
BSEL
WDSEL
ALUFN
Wr
WERF
ALUFN
ALU
B
WD
R/W
Wr
Data Memory
32
Adr
RD
32
0
6.004 – Fall 2002
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 12
Store Instruction
011001
PC
Rc
00
A
literal C (signed)
Mem[Reg[Ra]+SXT(C)] ← Reg[Rc]
ST:
Instruction
Memory
Ra
D
+4
Ra: <20:16>
Rc: <25:21>
Rb: <15:11>
0
Register
File
RA1
Rc: <25:21>
WA
WA
RD1
1
RA2SEL
RA2
WD
RD2
WE
WERF
C: SXT(<15:0>)
1
0
BSEL
Control Logic
32
RA2SEL
A
BSEL
WDSEL
ALUFN
Wr
ALUFN
ALU
B
WD
R/W
Wr
Data Memory
Adr
RD
WERF
0
6.004 – Fall 2002
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 13
JMP Instruction
JT
PCSEL
4
3
2
PC
1
0
011011
Rc
JMP:
00
A
Ra
literal C (signed)
Reg[Rc] ← PC+4; PC ← Reg[Ra]
Instruction
Memory
D
+4
Ra: <20:16>
Rc: <25:21>
Rb: <15:11>
0
Register
File
RA1
Rc: <25:21>
WA
WA
RD1
C: SXT(<15:0>)
1
RA2SEL
RA2
WD
RD2
WE
WERF
JT
1
0
BSEL
Control Logic
PCSEL
RA2SEL
A
BSEL
WDSEL
ALUFN
Wr
ALUFN
ALU
B
WD
R/W
Wr
Data Memory
Adr
RD
WERF
PC+4
32
0
6.004 – Fall 2002
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 14
BEQ/BNE Instructions
JT
PCSEL
4
3
011101
2
1
0
Rc
Ra
literal C (signed)
Reg[Rc] ← PC+4; if Reg[Ra]=0 then PC ← PC+4+4*SXT(C)
BEQ:
32
PC
00
011110
Instruction
Memory
A
Rc
Ra
literal C (signed)
Reg[Rc] ← PC+4; if Reg[Ra]≠0 then PC ← PC+4+4*SXT(C)
BNE:
D
+4
Ra: <20:16>
Rb: <15:11>
Rc: <25:21>
0
+
PC+4+4*SXT(C)
4*SXT(<15:0>)
Z
Register
File
RA1
Rc: <25:21>
WA
WA
RD1
Z
1
RA2SEL
RA2
WD
RD2
WE
WERF
JT
C: SXT(<15:0>)
1
0
BSEL
Control Logic
PCSEL
RA2SEL
BSEL
WDSEL
ALUFN
Wr
WERF
A
ALUFN
ALU
B
WD
R/W
Wr
Data Memory
Adr
RD
PC+4
0
6.004 – Fall 2002
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 15
Load Relative Instruction
011111
LDR:
Rc
Ra
literal C (signed)
Reg[Rc] ← Mem[PC + 4+ 4*SXT(C)]
Hey, WAIT A MINUTE. What’s Load Relative good for anyway??? I
thought
• Code is “PURE”, i.e. READ-ONLY; and stored in a “PROGRAM”
region of memory;
• Data is READ-WRITE, and stored either
• On the STACK (local); or
• In some GLOBAL VARIABLE region; or
C:
C:
BETA:
BETA:
• In a global storage HEAP.
So why an instruction designed to load data
that’s “near” the instruction???
Addresses & other large constants
6.004 – Fall 2002
10/29/0
2
c1:
c1:
XX == XX ** 123456;
123456;
LD(X,
LD(X, r0)
r0)
LDR(c1,
LDR(c1, r1)
r1)
MUL(r0,
r1,
MUL(r0, r1, r0)
r0)
ST(r0,
X)
ST(r0, X)
...
...
LONG(123456)
LONG(123456)
L15 – Building a Beta 16
LDR Instruction
JT
PCSEL
4
3
2
1
IF
PC
011111
0
Rc
literal C (signed)
Reg[Rc] ← Mem[PC + 4 + 4*SXT(C)]
LDR:
00
Ra
Instruction
Memory
A
D
+4
Ra: <20:16>
Rc: <25:21>
Rb: <15:11>
0
+
Register
File
RA1
Rc: <25:21>
WA
WA
RD1
Z
PC+4+4*SXT(C)
1
RA2SEL
RA2
WD
RD2
WE
WERF
JT
C:SXT( <15:0>)
Z
ASEL
1
0
1
0
BSEL
Control Logic
PCSEL
RA2SEL
ASEL
BSEL
WDSEL
ALUFN
Wr
WERF
A
ALUFN
ALU
B
WD
R/W
Wr
Data Memory
Adr
RD
PC+4
0
6.004 – Fall 2002
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 17
Exception Processing
Plan:
• Interrupt running program
• Invoke exception handler (like a procedure call)
• Return to continue execution.
We’d like RECOVERABLE INTERRUPTS for
• Synchronous events, generated by CPU or system
FAULTS (eg, Illegal Instruction, divide-by-0, illegal mem address)
TRAPS & system calls (eg, read-a-character)
• Asynchronous events, generated by I/O
(eg, key struck, packet received, disk transfer complete)
KEY: TRANSPARENCY to interrupted program.
• Most difficult for asynchronous interrupts
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 18
Implementation…
How exceptions work:
• Don’t execute current instruction
• Instead fake a “forced” procedure call
• save current PC (actually current PC + 4)
• load PC with exception vector
• 0x4 for synch. exception, 0x8 for asynch. exceptions
Question: where to save current PC + 4?
• Our approach: reserve a register (R30, aka XP)
• Prohibit user programs from using XP. Why?
IllOp:
PUSH(XP)
Example: DIV unimplemented
LD(R31,A,R0)
LD(R31,B,R1)
DIV(R0,R1,R2)
ST(R2,C,R31)
Forced by
hardware
Fetch inst. at Mem[Reg[XP]–4]
check for DIV opcode, get reg numbers
perform operation in SW, fill result reg
POP(XP)
JMP(XP)
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 19
ILL
XAdr OP
PCSEL
4
3
Exceptions
JT
2
PC
1
0
Bad Opcode: Reg[XP] ← PC+4; PC ← “IllOp”
Other:
Reg[XP] ← PC+4; PC ←“Xadr”
00
A
Instruction
Memory
D
+4
Ra: <20:16>
0
+
1
WASEL
XP
Rc: <25:21>
PC+4+4*SXT(C)
Register
File
RA1
1
WA
WA
0
RD1
Z
IRQ
Rc: <25:21>
Rb: <15:11>
RA2SEL
RA2
WD
RD2
WE
WERF
JT
C: SXT(<15:0>)
Z
ASEL
1
0
1
0
BSEL
Control Logic
PCSEL
RA2SEL
ASEL
BSEL
WDSEL
ALUFN
Wr
WERF
WASEL
A
ALUFN
ALU
WD
R/W
Wr
Data Memory
Adr
RD
PC+4
0
6.004 – Fall 2002
B
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 20
—
1
—
0
0
—
4
—
1
LDR
—
“A” —
1
1
1
—
— —
0
2
0
0
0
0
—
— —
Z ? 0 : 1 0
3
—
1 —
0
0
1
BNE
trap
—
1
—
0
0
—
Z ? 1 : 0
—
0
Illop
ALUFN F(op) F(op) “+” “+” —
WERF
1
1
1
0
1
BSEL
0
1
1
1 —
WDSEL
1
1
2 — 0
WR
0
0
0
1
0
RA2SEL
0
—
— 1 —
PCSEL
0
0
0
0
2
ASEL
0
0
0
0 —
WASEL
0
0
0 — 0
BEQ
JMP
ST
LD
OPC
OP
Control Logic
Implementation choices:
• ROM indexed by opcode, external branch & trap logic
• PLA
• “random” logic (eg, standard cell gates)
6.004 – Fall 2002
10/29/0
2
L15 – Building a Beta 21
Beta: Our “Final Answer”
ILL
XAdr OP
PCSEL
4
3
JT
2
PC
1
0
00
A
Instruction
Memory
D
+4
Ra: <20:16>
0
+
1
WASEL
XP
Rc: <25:21>
PC+4+4*SXT(C)
Register
File
RA1
1
WA
WA
0
RD1
Z
IRQ
Rc: <25:21>
Rb: <15:11>
RA2SEL
RA2
WD
RD2
WE
WERF
JT
C: SXT(<15:0>)
Z
ASEL
1
0
1
0
BSEL
Control Logic
PCSEL
RA2SEL
ASEL
BSEL
WDSEL
ALUFN
Wr
WERF
WASEL
A
ALUFN
ALU
WD
R/W
Wr
Data Memory
Adr
RD
PC+4
0
6.004 – Fall 2002
B
10/29/0
2
1
2
WDSEL
L15 – Building a Beta 22
Next Time: Pipelined Betas
Hey, building a
computer is not
really all that
difficult
6.004 – Fall 2002
So let’s run down
to the 6.004 lab
and put Intel out
of business!
10/29/0
2
L15 – Building a Beta 23
Download