Last time Today Why a fictitious ISA? Y86 Processor State Y86

advertisement
Last time
Lecture 11:
Sequential processor design
Computer Architecture and
Systems Programming
(252-0061-00)
• Operators
• Function pointers
• Typedefs and structures
• goto
• Assertions
Timothy Roscoe
Herbstsemester 2012
• Are arrays the same as pointers?
• setjmp()/longjmp()
• Coroutines
© Systems Group | Department of Computer Science | ETH Zürich
Today
Why a fictitious ISA?
• Y86 instruction set architecture
• Simpler than X86, but close to it
– Processor state
– Instruction encoding
– Compiling, assembling, and simulation
• Sequential processor design
– Fetch, decode, execute, memory, write back, PC update
– Performance implications
• Illustrate instruction encoding with concrete
examples
• Basis for assembler / simulator exercises
• Simple enough to design
– Sequential processor (this time)
– Pipelined processor (next that)
Y86 Processor State
Program registers
%eax
%ecx
%edx
%ebx
%esi
%edi
%esp
p
%ebp
Condition
codes
od
des
ess
Memory
• Format
OF
F ZF
F SF
– 1-6 bytes of information read from memory
PC
• Can determine instruction length from first byte
• Not as many instruction types, and simpler encoding than with
IA32
– Program registers
• Same 8 as with IA32. Each 32 bits
– Condition codes
• Single-bit flags set by arithmetic or logical instructions
– OF: Overflow
Y86 Instructions
ZF: Zero
SF:Negative
– Program counter
• Indicates address of instruction
– Memory
• Byte-addressable storage array
• Words stored in little-endian byte order
– Each accesses and modifies some part(s) of the program
state
Encoding Registers
Instruction Example
Addition instruction
• Each register has 4-bit ID
Generic Form
Encoded Representation
%eax
%ecx
%edx
%ebx
0
1
2
3
%esi
%edi
%esp
p
%ebp
6
7
4
5
addl rA, rB
6 0 rA
A rB
– Add value in register rA to that in register rB
– Same encoding as in IA32
• Store result in register rB
• Note that Y86 only allows addition to be applied to register data
• Register ID 8 indicates “no register”
– Will use this in our hardware design in multiple places
– Set condition codes based on result
– e.g., addl %eax,%esiEncoding: 60 06
– Two-byte encoding
• First indicates instruction type
• Second gives source and destination registers
Arithmetic and Logical
Operations
rrmovl rA, rB
2 0 rA
A rB
6 0 rA
A rB
• Encodings differ only by
“function code”
irmovl V, rB
3 0 8 rB
V
Immediate їRegister
– Low-order 4 bytes in first
instruction word
rmmovl rA,D(rB)
4 0 rA
A rB
D
Register їMemory
6 1 rA
A rB
mrmovl D(rB),rA
5 0 rA
A rB
D
Memory їRegister
Function code
Subtract (rA from rB)
subl rA, rB
And
andl rA, rB
• Set condition codes as
side effect
6 2 rA
A rB
Exclusive-Or
xorl rA, rB
Register їRegister
• Refer to generically as
“OPl”
Instruction code
Add
addl rA, rB
Move Operations
6 3 rA
A rB
– Like the IA32 movl instruction
– Simpler format for memory addresses
– Give different names to keep them distinct
Move Instruction Examples
Jump Instructions
Jump Unconditionally
IA32
Y86
Encoding
movl $0xabcd, %edx
irmovl $0xabcd, %edx
30 82 cd ab 00 00
movl %esp, %ebx
rrmovl %esp, %ebx
20 43
movl -12(%ebp),%ecx
mrmovl -12(%ebp),%ecx
50 15 f4 ff ff ff
movl %esi,0x41c(%esp)
rmmovl %esi,0x41c(%esp)
40 64 1c 04 00 00
jmp Dest
7 0
Dest
Jump When
h Less Than
h or EEquall
jle Dest
7 1
Dest
7 2
Dest
7 3
Dest
7 4
Dest
Jump When
h Less Than
h
jl Dest
Jump When
h Equall
—
je Dest
movl %eax, 12(%eax,%edx)
—
Jump When
h Not Equall
movl (%ebp,%eax,4),%ecx
—
movl $0xabcd, (%eax)
jne Dest
Jump When
h Greater Th
Than
h or EEquall
jge Dest
7 5
Dest
Jump When
h Greater Th
Than
h
jg Dest
7 6
Dest
• Refer to generically as
“jXX”
• Encodings differ only
by “function code”
• Based on values of
condition codes
• Same as IA32
counterparts
• Encode full
destination address
– Unlike PC-relative
addressing seen in
IA32
Y86 Program Stack
Stack “Bottom”
– Region of memory holding
program data
– Used in Y86 (and IA32) for
supporting procedure calls
– Stack top indicated by %esp
pushl rA
– Stack grows toward lower
addresses
popl rA
• Top element is at highest
address in the stack
• When pushing, must first
decrement stack pointer
• When popping, increment
stack pointer
%esp
a 0 rA
A 8
– Decrement %esp by 4
– Store word from rA to memory at %esp
– Like IA32
• Address of top stack element
•
•
•
Increasing
Addresses
Stack Operations
–
–
–
–
b 0 rA
A 8
Read word from memory at %esp
Save in rA
Increment %esp by 4
Like IA32
Stack “Top”
Subroutine Call and Return
call Dest
Miscellaneous Instructions
Dest
8 0
nop
– Push address of next instruction onto stack
– Start executing instructions at Dest
– Like IA32
– Don’t do anything
halt
ret
9 0
Writing Y86 Code
Y86 code generation example
•
First try
– Write typical array code
– Write code in C
– Compile for IA32 with gcc -S
– Transliterate into Y86
• Coding example
– Find number of elements in null-terminated list
int len1(int a[]);
a
1 0
– Stop executing instructions
– IA32 has comparable instruction, but can’t execute it in
user mode
– We will use it to stop the simulator
– Pop value from stack
– Use as address for next instruction
– Like IA32
• Try to use C compiler as much as possible
0 0
/* Find number of elements in
null-terminated list */
int len1(int a[])
{
int len;
for (len = 0; a[len]; len++)
;
return len;
}
5043
5
6125
7395
0
Ÿ 3
– Compile with gcc -O2 -S
•
Problem
– Hard to do array indexing on Y86
• Since don’t have scaled
addressing modes
L18:
incl %eax
cmpl $0,(%edx,%eax,4)
jne L18
Y86 code generation example #2
•
Second try
•
– Write with pointer code
Result
•
– Don’t need to do indexed
addressing
/* Find number of elements in
null-terminated list */
int len2(int a[])
{
int len = 0;
while (*a++)
len++;
return len;
}
Y86 code generation example #3
L24:
movl (%edx),%eax
incl %ecx
L26:
addl $4,%edx
testl %eax,%eax
jne L24
IA32 code
•
– Setup
Y86 code
– Setup
len2:
pushl %ebp
#
xorl %ecx,%ecx
#
rrmovl %esp,%ebp
#
mrmovl 8(%ebp),%edx #
mrmovl (%edx),%eax #
jmp L26
#
len2:
pushl %ebp
xorl %ecx,%ecx
movl %esp,%ebp
movl 8(%ebp),%edx
movl (%edx),%eax
jmp L26
Save %ebp
len = 0
Set frame
Get a
Get *a
Goto entry
– Compile with gcc -O2 -S
Y86 code generation example #4
•
IA32 code
•
– loop + finish
Y86 code
– loop + finish
L24:
movl (%edx),%eax
incl %ecx
L24:
mrmovl (%edx),%eax
irmovl $1,%esi
addl %esi,%ecx
L26:
irmovl $4,%esi
addl %esi,%edx
andl %eax,%eax
jne L24
rrmovl %ebp,%esp
rrmovl %ecx,%eax
popl %ebp
ret
L26:
addl $4,%edx
testl %eax,%eax
jne L24
movl %ebp,%esp
movl %ecx,%eax
popl %ebp
ret
# Get *a
# len++
# Entry:
#
#
#
#
#
a++
*a == 0?
No--Loop
Pop
Rtn len
Assembling Y86 Program
unix> yas eg.ys
b3130000
ed170000
e31c0000
00000000
# Set up stack
# Set up frame
– Program starts at
address 0
– Must set up stack
# Push argument
# Call Function
# Halt
• Make sure don’t
overwrite code!
# List of elements
– Must initialize data
– Can use symbolic
names
# Function
len2:
. . .
# Allocate space for stack
.pos 0x100
Stack:
Simulating Y86 Program
– Instruction set simulator
• Actually looks like disassembler output
308400010000
2045
308218000000
a028
8028000000
10
irmovl Stack,%esp
rrmovl %esp,%ebp
irmovl List,%edx
pushl %edx
call len2
halt
.align 4
List:
.long 5043
.long 6125
.long 7395
.long 0
unix> yis eg.yo
– Generates “object code” file eg.yo
0x000:
0x006:
0x008:
0x00e:
0x010:
0x015:
0x018:
0x018:
0x018:
0x01c:
0x020:
0x024:
Y86 Program Structure
|
|
|
|
|
|
|
|
|
|
|
|
irmovl Stack,%esp
rrmovl %esp,%ebp
irmovl List,%edx
pushl %edx
call len2
halt
.align 4
List:
.long 5043
.long 6125
.long 7395
.long 0
# Set up stack
# Set up frame
# Push argument
# Call Function
# Halt
# List of elements
• Computes effect of each instruction on processor state
• Prints changes in state from original
Stopped in 41 steps at PC = 0x16. Exception 'HLT', CC Z=1 S=0 O=0
Changes to registers:
%eax:
0x00000000
0x00000003
%ecx:
0x00000000
0x00000003
%edx:
0x00000000
0x00000028
%esp:
0x00000000
0x000000fc
%ebp:
0x00000000
0x00000100
%esi:
0x00000000
0x00000004
Changes to memory:
0x00f4:
0x00f8:
0x00fc:
0x00000000
0x00000000
0x00000000
0x00000100
0x00000015
0x00000018
Summary
Y86 Instruction Set
• Y86 instruction set architecture (ISA)
– Similar state and instructions as IA32
– Simpler encodings
– Somewhere between CISC and RISC
• How important is ISA design?
– Less now than before
• With enough hardware, can make almost anything go fast
• Added complexity of little-used instructions is not many transistors
– Internally, IA32 is almost RISC anyway
• Decode unit can generate stream of “RISC-like” instructions
Building Blocks
A
• Combinational logic
– Compute Boolean functions
of inputs
– Continuously respond to
input changes
– Operate on data and
implement control
B
Byte
nop
0 0
addl 6 0
halt
1 0
subl 6 1
rrmovl V,rB
2 0 rA
A rB
irmovl V,rB
3 0 8 rB
D
rmmovl rA,D(rB)
4 0 rA
A rB
D
mrmovl D(rB),rA
5 0 rA
A rB
D
OP1 rA,rB
6 fn
fn rA
A rB
jXX Dest
7 fn
fn
call Dest
8 0
ret
0
1
2
3
4
5
andl 6 2
xorl 6 3
jmp
7 0
jle
7 1
Dest
jl
7 2
Dest
je
7 3
9 0
jne
7 4
pushl rA
A 0 rA
A 8
jge
7 5
popl rA
B 0 rA
A 8
jg
7 6
Hardware Control Language
fun
A
L
U
– Very simple hardware description language
– Can only express limited aspects of hardware operation
=
• Parts we want to explore and modify
• Data types
0
MUX
– bool: Boolean
• a, b, c, …
1
– int: words
• A, B, C, …
• Does not specify word size---bytes, 32-bit words, …
• Storage elements
–
–
–
–
Store bits
Addressable memories
Non-addressable registers
Loaded only as clock rises
valA
srcA
A
valB
Register
file
srcB
B
• Statements
valW
–
–
W dstW
Clock
bool a = bool-expr ;
int A = int-expr ;
Clock
newPC
HCL Operations
– Classify by type of value returned
• Boolean expressions
– Logic operations
• a && b, a || b, !a
– Word comparisons
• A == B, A != B, A < B, A <= B, A >= B, A > B
– Set membership
• A in { B, C, D }
– Same as A == B || A == C || A == D
• Word expressions
– Case expressions
• [ a : A; b : B; c : C ]
• Evaluate test expressions a, b, c, … in sequence
• Return word expression A, B, C, … for first successful test
SEQ Hardware
Structure
• State
–
–
–
–
PC
valE , valM
Write backk
valM
M
Data
Data
memory
memory
Memory
Addrrr,, Data
Add
Program counter register (PC)
Condition code register (CC)
Register File
Memories
valE
Bch
Execute
CC
CC
• Access same memory space
• Data: for reading/writing program data
• Instruction: for reading instructions
• Instruction Flow
ALU
ALU
aluA
A, aluB
A,
valA
valA,
A, valB
srcA
A,, srcB
dstA
A, dstB
A,
Decode
– Read instruction at address specified by
PC
– Process through stages
Fetch
– Update program counter
icode ifun
rA , rB
valC
A
valP
,
Instruction
stt ucction
Instruction
memory
m
emo
ory
memory
PC
B
Register
Register
file
file
PC
PC
increment
increment
M
E
newPC
SEQ Stages
PC
valM
• Fetch
– Read instruction from instruction
memory
Instruction Decoding
valE , valM
Write back
Optional
p
al
Data
Data
memory
memory
Memory
Optional
Addr, Data
5 0 rA
A rB
• Decode
D
valE
– Read program registers
• Execute
Bch
Execute
CC
CC
icode
ifun
rA
A
rB
valC
ALU
ALU
aluA, aluB
– Compute value or address
valA, valB
• Memory
– Read or write data
srcA, srcB
dstA, dstB
Decode
A
icode ifun
rA , rB
valC
• PC
Fetch
M
E
valP
,
– Write program registers
B
Register
Register
file
file
• Write Back
Instruction
Instruction
memory
memory
PC
PC
increment
increment
– Update program counter
PC
Executing Arith./Logical Operation
OP1 rA, rB
• Fetch
– Do nothing
• Decode
• Write back
– Read operand registers
• Execute
– Update register
• PC Update
– Perform operation
– Set condition codes
– Instruction byte
icode:ifun
– Optional register byte rA:rB
– Optional constant word valC
Stage Computation: Arith/Log. Ops
OPl rA, rB
6 fn
fn rA
A rB
• Memory
– Read 2 bytes
• Instruction Format
– Increment PC by 2
icode:ifun m M1[PC]
Read instruction byte
rA:rB m M1[PC+1]
Read register byte
valP m PC+2
Compute next PC
valA m R[rA]
Read operand A
valB m R[rB]
Read operand B
valE m valB OP valA
Set CC
Perform ALU operation
Memory
Write
R[rB] m valE
Write back result
back
PC update
PC m valP
Update PC
Fetch
Decode
Execute
Set condition code register
– Formulate instruction execution as sequence of simple steps
– Use same general form for all instructions
Executing rmmovl
rmmovl rA,D(rB)
4 0 rA
A rB
• Fetch
– Read 6 bytes
• Decode
– Read operand registers
• Execute
– Compute effective
address
Stage Computation: rmmovl
rmmovl rA, D(rB)
D
• Memory
Fetch
– Write to memory
• Write back
– Do nothing
• PC Update
– Increment PC by 6
Decode
Execute
Memory
Write
back
PC update
icode:ifun m M1[PC]
Read instruction byte
rA:rB m M1[PC+1]
Read register byte
valC m M4[PC+2]
Read displacement D
valP m PC+6
Compute next PC
valA m R[rA]
Read operand A
valB m R[rB]
Read operand B
valE m valB + valC
Compute effective address
M4[valE] m valA
Write value to memory
PC m valP
Update PC
– Use ALU for address computation
Executing popl
popl rA
Stage Computation: popl
popl rA
b 0 rA
A 8
• Fetch
– Read from old stack
pointer
• Decode
• Write back
– Read stack pointer
– Increment stack pointer
by 4
• PC Update
– Increment PC by 2
Executing Jumps
t
thr
7 fall
fn thru:
fn
jXX Dest
Read register byte
valP m PC+2
valA m R[%esp]
Compute next PC
valB m R [%esp]
Read stack pointer
valE m valB + 4
Increment stack pointer
Memory
Write
valM m M4[valA]
R[%esp] m valE
Read from stack
back
PC update
R[rA] m valM
Write back result
PC m valP
Update PC
Decode
Execute
– Update stack pointer
– Write result to register
• Execute
Read instruction byte
rA:rB m M1[PC+1]
Fetch
• Memory
– Read 2 bytes
icode:ifun m M1[PC]
Stage Computation: Jumps
jXX Dest
Not taken
Fetch
target:
xx
Target: xx xx
• Fetch
Taken
• Memory
– Read 5 bytes
– Increment PC by 5
– Do nothing
• Write ,back
• Decode
– Do nothing
– Do nothing
• PC Update
• Execute
– Set PC to Dest if branch taken
or to incremented PC if not
branch
– Determine whether to take
branch based on jump
condition and condition
codes
Executing call
t
thr
8 fall
0 thru:
call Dest
Execute
back
PC update
• Decode
– Read stack pointer
• Execute
– Decrement stack pointer by 4
• Write back
– Update stack pointer
• PC Update
– Set PC to Dest
Read destination address
valP m PC+5
Fall through address
Bch m Cond(CC,ifun)
Take branch?
PC m Bch ? valC : valP
Update PC
Stage Computation: call
call Dest
Fetch
– Write incremented PC to
new value of stack pointer
valC m M4[PC+1]
– Compute both addresses
– Choose based on setting of condition codes and branch
condition
Dest
• Memory
Read instruction byte
Memory
Write
icode:ifun m M1[PC]
Read instruction byte
valC m M4[PC+1]
Read destination address
valP m PC+5
Compute return point
valB m R[%esp]
Read stack pointer
valE m valB + –4
Decrement stack pointer
Memory
Write
M4[valE] m valP
R[%esp] m valE
Write return value on stack
back
PC update
PC m valC
Set PC to destination
x
Target: xx xx
– Read 5 bytes
– Increment PC by 5
icode:ifun m M1[PC]
Decode
x
Return: xx xx
• Fetch
Update stack pointer
– Use ALU to increment stack pointer
– Must update two registers: Popped value, New SP
Dest
xx
Fall through: xx xx
Read stack pointer
Decode
Execute
Update stack pointer
– Use ALU to decrement stack pointer
– Store incremented PC
Executing ret
Stage Computation: ret
9 0
ret
ret
icode:ifun m M1[PC]
Read instruction byte
valA m R[%esp]
Read operand stack pointer
valB m R[%esp]
Read operand stack pointer
valE m valB + 4
Increment stack pointer
Memory
Write
valM m M4[valA]
R[%esp] m valE
Read return address
back
PC update
PC m valM
Set PC to return address
Fetch
x
Return: xx xx
Decode
• Memory
• Fetch
Execute
– Read return address from old
stack pointer
– Read 1 byte
• Decode
• Write back
– Read stack pointer
– Update stack pointer
• PC Update
• Execute
– Use ALU to increment stack pointer
– Read return address from memory
– Set PC to return address
– Increment stack pointer by 4
Computation Steps
Computation Steps
OPl rA, rB
Fetch
Decode
Execute
call Dest
icode,ifun
icode:ifun m M1[PC]
Read instruction byte
rA,rB
rA:rB m M1[PC+1]
Read register byte
valC
icode,ifun
Fetch
[Read constant word]
valA m R[rA]
Read operand A
valA, srcA
valB, srcB
valB m R[rB]
Read operand B
valE
valE m valB OP valA
Set CC
Perform ALU operation
valM
back
PC update
dstM
dstE
PC
[Memory read/write]
R[rB] m valE
Write back ALU result
[Write back memory result]
PC m valP
Update PC
– All instructions follow same general pattern
– Differ in what gets computed on each step
• Fetch
• Decode
–
–
–
–
–
–
srcA
srcB
dstE
dstM
valA
valB
Register ID A
Register ID B
Destination Register E
Destination Register M
Register value A
Register value B
– valE
– Bch
Read operand B
valE
valE m valB + –4
Perform ALU operation
Cond code
Memory
Write
valM
back
PC update
dstM
dstE
PC
[Set condition code reg.]
M4[valE] m valP
R[%esp] m valE
[Memory read/write]
[Write back ALU result]
Write back memory result
PC m valC
Update PC
newPC
New
PC
PC
valM
Key
ALU result
Branch flag
• Memory
– valM
valB m R[%esp]
SEQ Hardware
• Execute
Instruction code
Instruction function
Instr. Register A
Instr. Register B
Instruction constant
Incremented PC
[Read operand A]
valB, srcB
– All instructions follow same general pattern
– Differ in what gets computed on each step
Computed Values
icode
ifun
rA
rB
valC
valP
Compute next PC
valA, srcA
Memory
Write
–
–
–
–
–
–
Read constant word
valP m PC+5
valP
Execute
[Read register byte]
valC m M4[PC+1]
Compute next PC
Set condition code register
Read instruction byte
valC
valP m PC+2
Decode
icode:ifun m M1[PC]
rA,rB
valP
Cond code
Update stack pointer
Value from memory
data out
• Blue boxes: predesigned
hardware blocks
read
Mem.
control
Memory
write
Addr
Bch
– E.g., memories, ALU
Data
Data
memory
memory
Execute
Data
valE
ALU
fun.
ALU
ALU
CC
CC
• Gray boxes: control logic
ALU
A
ALU
B
– Describe in HCL
valA valB dstE dstM srcA srcB
• White ovals: labels for signals
• Thick lines: 32-bit word values
dstE
E dstM
M srcA
A srcB
Decode
B
Write back
• Thin lines: 4-8 bit values
• Dotted lines: 1-bit values
A
Register
RegisterM
file
file E
icode ifun
Fetch
rA
rB
Instruction
Instruction
memory
memory
PC
valC
valP
PC
PC
increment
increment
Fetch Logic
icode
ifun
rA
rB
valC
valP
Need
valC
Instr
valid
icode
Align
Align
Byte 0
ifun
–
–
–
–
• Control Logic
PC
nop
0 0
halt
1 0
rrmovl V,rB
2 0 rA
A rB
immovl V,rB
3 0 8 rB
B
D
rmmovl rA,D(rB) 4 0 rA
A rB
B
D
mrmovl D(rB),rA
5 0 rA
A rB
B
D
OP1 rA,rB
6 fn
fn rA
A rB
jXX Dest
7 fn
fn
Dest
call Dest
8 0
Dest
ret
9 0
pushl rA
A 0 rA
A 8
popl rA
B 0 rA
A 8
• Register File
– Read ports A, B
– Write ports E, M
– Addresses are register
IDs or 8 (no access)
Write-back
Read operand A
Write-back
Read stack pointer
Write-back
dstE dstM srcA
icode
rA
E
srcB
srcB
rB
R[rB] m valE
Write back result
No operand
Write-back
No operand
Write-back
Read stack pointer
Write-back
None
popl rA
R[%esp] m valE
Update stack pointer
jXX Dest
call Dest
None
call Dest
ret
valA m R[%esp]
srcA
rmmovl rA, D(rB)
jXX Dest
Decode
dstM
M
OPl rA, rB
Read operand A
popl rA
Decode
B
Register
Register
file
file
valM valE
E Destination
rmmovl rA, D(rB)
Decode
A
dstE
– srcA, srcB: read port
addresses
– dstA, dstB: write port
addresses
valB
valA
• Control Logic
OPl rA, rB
valA m R[%esp]
PC
Decode Logic
A Source
Decode
Bytes 1-5
– Instr. Valid: Is this instruction valid?
– Need regids: Does this instruction have a register bytes?
– Need valC: Does this instruction have a constant word?
bool instr_valid = icode in
{ INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };
valA m R[rA]
PC
PC
increment
increment
Instruction
Instruction
memory
memory
bool need_regids =
icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
IIRMOVL, IRMMOVL, IMRMOVL };
Decode
valP
Align
Align
Byte 0
Fetch Control Logic
valA m R[rA]
valC
Need
regids
Split
Split
Bytes 1-5
PC: Register containing PC
Instruction memory: Read 6 bytes (PC to PC+5)
Split: Divide instruction byte into icode and ifun
Align: Get fields for rA, rB, and valC
Decode
rB
Need
valC
Instruction
Instruction
memory
memory
• Predefined Blocks
rA
Instr
valid
PC
PC
increment
increment
Need
regids
Split
Split
Fetch Logic
R[%esp] m valE
Update stack pointer
ret
int srcA = [
icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL
icode in { IPOPL, IRET } : RESP;
1 : RNONE; # Don't need register
];
} : rA;
R[%esp] m valE
Update stack pointer
int dstE = [
icode in { IRRMOVL, IIRMOVL, IOPL} : rB;
icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
1 : RNONE; # Don't need register
];
ALU A Input
Execute Logic
OPl rA, rB
valE m valB OP valA
Execute
• Units
Perform ALU operation
rmmovl rA, D(rB)
– ALU
• Implements 4 required functions
• Generates condition code values
– CC
Bch
ALU
fun.
ALU
ALU
CC
CC
– bcond
• Computes branch flag
Set
CC
ALU
A
call Dest
icode ifun
valC
valA
valE m valB OP valA
valE m valB + valC
Memory Logic
Perform ALU operation
• Memory
Compute effective address
Increment stack pointer
No operation
call Dest
valE m valB + –4
Decrement stack pointer
ret
valE m valB + 4
Increment stack pointer
int alufun = [
icode == IOPL : ifun;
1 : ALUADD;
];
– Reads or writes memory
word
Write value to memory
Memory
Read from stack
Memory
No operation
Memory
M4[valE] m valP
Write return value on stack
Memory
valM m M4[valA]
valE
valA valP
No operation
Read return address
Memory
M4[valE] m valA
Write value to memory
valM m M4[valA]
Read from stack
No operation
call Dest
ret
Memory
icode
Mem
data
jXX Dest
call Dest
Memory
data in
Mem
addr
popl rA
jXX Dest
Memory
write
rmmovl rA, D(rB)
popl rA
valM m M4[valA]
Data
Data
memory
memory
OPl rA, rB
Memory
rmmovl rA, D(rB)
Memory
Mem.
write
read
Memory Read
No operation
M4[valE] m valA
data out
Mem.
read
– Mem. read: should word
be read?
– Mem. write: should word
be written?
– Mem. addr.: Select
address
– Mem. data.: Select data
OPl rA, rB
Memory
valM
• Control Logic
Memory Address
Memory
Increment stack pointer
int aluA = [
icode in { IRRMOVL, IOPL } : valA;
icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
icode in { ICALL, IPUSHL } : -4;
icode in { IRET, IPOPL } : 4;
]; # Other instructions don't need ALU
jXX Dest
Execute
valE m valB + 4
Execute
popl rA
valE m valB + 4
Decrement stack pointer
ret
valB
rmmovl rA, D(rB)
Execute
valE m valB + –4
Execute
OPl rA, rB
Execute
No operation
ALU
B
ALU Operation
Execute
Increment stack pointer
jXX Dest
Execute
• Control Logic
Execute
valE m valB + 4
Execute
bcond
bcond
Execute
Compute effective address
popl rA
• Register with 3 condition code bits
– Set CC: Should condition code
register be loaded?
– ALU A: Input A to ALU
– ALU B: Input B to ALU
– ALU fun: What function should
ALU compute?
valE m valB + valC
Execute
valE
M4[valE] m valP
Write return value on stack
ret
int mem_addr = [
icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
icode in { IPOPL, IRET } : valA;
# Other instructions don't need address
];
valM m M4[valA]
Read return address
bool mem_read = icode in { IMRMOVL, IPOPL, IRET };
PC Update
PC Update Logic
OPl rA, rB
PC update
• New PC
PC m valP
Update PC
rmmovl rA, D(rB)
– Select next value of PC
PC update
PC
PC m valP
Update PC
popl rA
PC update
New
PC
PC m valP
Update PC
jXX Dest
icode
Bch
valC
valM
valP
PC update
PC m Bch ? valC : valP
Update PC
call Dest
PC update
PC m valC
Set PC to destination
ret
PC update
PC m valM
Set PC to return address
int new_pc = [
icode == ICALL : valC;
icode == IJXX && Bch : valC;
icode == IRET : valM;
1 : valP;
];
SEQ Operation
• State
Read
Combinational
Logic
– PC register
– Cond. Code register
– Data memory
– Register file
All updated as clock rises
Write
Data
Data
memory
memory
CC
CC
Read
Ports
• Combinational Logic
Write
Ports
– ALU
– Control logic
– Memory reads
Register
Register
file
file
SEQ
Operation
#2
CC
CC
100
100
Cycle 2
Cycle 3
Cycle 4
Cycle 1:
0x000:
irmovl $0x100,%ebx
# %ebx <-- 0x100
Cycle 2:
0x006:
irmovl $0x200,%edx
# %edx <-- 0x200
Cycle 3:
0x00c:
addl %edx,%ebx
# %ebx <-- 0x300 CC <-- 000
Cycle 4:
0x00e:
je dest
# Not taken
Read
Combinational
Logic
Write
– state set according to
second irmovl
instruction
– combinational logic
starting to react to
state changes
Data
Data
memory
memory
Read
Ports
Write
Ports
Register
Register
file
file
• Instruction memory
• Register file
• Data memory
PC
0x00c
Cycle 1
Clock
%ebx
%ebx == 0x100
0x100
PC
0x00c
SEQ
Operation
#3
Cycle 1
000
Cycle 3
Cycle 4
Cycle 1:
0x000:
irmovl $0x100,%ebx
# %ebx <-- 0x100
Cycle 2:
0x006:
irmovl $0x200,%edx
# %edx <-- 0x200
Cycle 3:
0x00c:
addl %edx,%ebx
# %ebx <-- 0x300 CC <-- 000
Cycle 4:
0x00e:
je dest
# Not taken
Read
Combinational
nal
Logic
CC
CC
100
100
Cycle 2
Clock
Write
Data
Data
memory
memory
Read
Ports
Write
Ports
Register
Register
file
file
%ebx
%ebx == 0x100
0x100
– state set according to
second irmovl
instruction
– combinational logic
generates results for
addl instruction
SEQ
Operation
#4
CC
CC
000
000
Cycle 3
Cycle 4
Cycle 1:
0x000:
irmovl $0x100,%ebx
# %ebx <-- 0x100
0x006:
irmovl $0x200,%edx
# %edx <-- 0x200
Cycle 3:
0x00c:
addl %edx,%ebx
# %ebx <-- 0x300 CC <-- 000
Cycle 4:
0x00e:
je dest
# Not taken
Write
Data
Data
memory
memory
Read
Ports
Write
Ports
Register
Register
file
file
%ebx
%ebx == 0x300
0x300
PC
0x00e
Cycle 2
Cycle 2:
Read
Combinational
Logic
0x00e
PC
0x00c
Cycle 1
Clock
– state set
according to
addl instruction
– combinational
logic starting to
react to state
changes
SEQ
Operation
#5
Cycle 1
Cycle 4
Cycle 1:
0x000:
irmovl $0x100,%ebx
# %ebx <-- 0x100
0x006:
irmovl $0x200,%edx
# %edx <-- 0x200
Cycle 3:
0x00c:
addl %edx,%ebx
# %ebx <-- 0x300 CC <-- 000
Cycle 4:
0x00e:
je dest
# Not taken
Write
Data
Data
memory
memory
Read
Ports
Write
Ports
Register
Register
file
file
%ebx
%ebx == 0x300
0x300
0x013
PC
0x00e
Cycle 3
Cycle 2:
Read
Combinational
nal
Logic
CC
CC
000
000
Cycle 2
Clock
– state set
according to
addl instruction
– combinational
logic generates
results for je
instruction
SEQ Summary
• Implementation
– Express every instruction as series of simple steps
– Follow same general flow for each instruction type
– Assemble registers, memories, predesigned combinational
blocks
– Connect with control logic
• Limitations
– Too slow to be practical
– In one cycle, must propagate through instruction memory,
register file, ALU, and data memory
– Would need to run clock very slowly
– Hardware units only active for fraction of clock cycle
Download