COSC 6385
Computer Architecture
- Tomasulo’s Algorithm (II)
Edgar Gabriel
Spring 2012
Data fields for reservation stations
• Op: operation to perform on source operands S1 and S2
• Qj, Qk: reservation stations producing the operands
• Vj, Vk: value of each operand
• A: holds information for memory address calculation
(immediate field, effective address)
• Busy: indicates occupied functional units/reservation
stations
• Qi: number of the reservation station that will produce
the data to be stored in this register
Detailed steps
• Let's look at the details for an operation
OP rd, rs, rt (e.g. ADD.D F6, F2, F0)
• Assume that
– Operation has been assigned to reservation station r
– RS[r] is the data structure holding all the fields for
reservation station r, as described in the last lecture
– RegisterStat[rs] is the data structure holding the
status of register rs (e.g. whether a reservation station
will write the register)
– Regs[rs] is the register rs in the register file
Detailed steps (II)
Instruction state: Issue (FP operation)
Wait until: Station r empty
Action / bookkeeping:
    if ( RegisterStat[rs].Qi != 0 ) {
        RS[r].Qj = RegisterStat[rs].Qi;
    }
    else {
        RS[r].Qj = 0;
        RS[r].Vj = Regs[rs];
    }
    if ( RegisterStat[rt].Qi != 0 ) {
        RS[r].Qk = RegisterStat[rt].Qi;
    }
    else {
        RS[r].Qk = 0;
        RS[r].Vk = Regs[rt];
    }
    RS[r].Busy = yes;
    RegisterStat[rd].Qi = r;
Detailed steps (III)
Instruction state: Execute (FP operation)
Wait until: RS[r].Qj==0 && RS[r].Qk==0
Action / bookkeeping:
    /* compute result using Vj and Vk */
Instruction state: Write result (FP operation)
Wait until: Execution complete and CDB available
Action / bookkeeping:
    ∀x : if ( RegisterStat[x].Qi == r ) {
        Regs[x] = result;
        RegisterStat[x].Qi = 0;
    }
    ∀x : if ( RS[x].Qj == r ) {
        RS[x].Vj = result;
        RS[x].Qj = 0;
    }
    ∀x : if ( RS[x].Qk == r ) {
        RS[x].Vk = result;
        RS[x].Qk = 0;
    }
    RS[r].Busy = no;
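The issue, execute and write-result bookkeeping for an FP operation can be sketched in Python as follows. This is only an illustrative sketch: the names Station, issue, ready, execute, write_result and demo are hypothetical, stations and registers are plain dictionaries, and timing (CDB arbitration, execution latency) is ignored.

```python
from dataclasses import dataclass

@dataclass
class Station:
    busy: bool = False
    op: str = ""
    vj: float = 0.0
    vk: float = 0.0
    qj: int = 0   # 0 means the operand value is already in vj
    qk: int = 0   # 0 means the operand value is already in vk

def issue(rs, reg_stat, regs, r, op, rd, rs_reg, rt):
    """Issue 'op rd, rs_reg, rt' to reservation station r (assumed to be free)."""
    st = rs[r]
    if reg_stat[rs_reg] != 0:
        st.qj = reg_stat[rs_reg]          # wait for the producing station
    else:
        st.qj, st.vj = 0, regs[rs_reg]    # operand is available in the register file
    if reg_stat[rt] != 0:
        st.qk = reg_stat[rt]
    else:
        st.qk, st.vk = 0, regs[rt]
    st.busy, st.op = True, op
    reg_stat[rd] = r                      # station r will produce rd

def ready(rs, r):
    """Execute may start once both operands are available."""
    return rs[r].busy and rs[r].qj == 0 and rs[r].qk == 0

def execute(rs, r):
    """Compute the result using Vj and Vk (only two operations in this sketch)."""
    ops = {"ADD.D": lambda a, b: a + b, "MUL.D": lambda a, b: a * b}
    return ops[rs[r].op](rs[r].vj, rs[r].vk)

def write_result(rs, reg_stat, regs, r, result):
    """Broadcast the result of station r to registers and waiting stations."""
    for x in reg_stat:
        if reg_stat[x] == r:
            regs[x] = result
            reg_stat[x] = 0
    for st in rs.values():
        if st.qj == r:
            st.vj, st.qj = result, 0
        if st.qk == r:
            st.vk, st.qk = result, 0
    rs[r].busy = False

def demo():
    regs = {"F0": 1.0, "F2": 2.0, "F4": 4.0, "F6": 0.0, "F8": 0.0}
    reg_stat = {name: 0 for name in regs}
    rs = {1: Station(), 2: Station()}
    issue(rs, reg_stat, regs, 1, "MUL.D", "F6", "F2", "F4")
    issue(rs, reg_stat, regs, 2, "ADD.D", "F8", "F6", "F0")  # depends on station 1
    waiting = not ready(rs, 2)
    write_result(rs, reg_stat, regs, 1, execute(rs, 1))
    write_result(rs, reg_stat, regs, 2, execute(rs, 2))
    return waiting, regs["F6"], regs["F8"]
```

In demo(), the ADD.D is issued while its first operand is still being produced by station 1, so it records Qj = 1 and only becomes ready after station 1 has written its result.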
Detailed steps (IV)
For a LOAD operation, e.g. LD rt, imm(rs)
Instruction state: Issue (Load)
Wait until: Buffer r empty
Action / bookkeeping:
    if ( RegisterStat[rs].Qi != 0 ) {
        RS[r].Qj = RegisterStat[rs].Qi;
    }
    else {
        RS[r].Qj = 0;
        RS[r].Vj = Regs[rs];
    }
    RS[r].A = imm;
    RS[r].Busy = yes;
    RegisterStat[rt].Qi = r;
Detailed steps (V)
Instruction state: Execute (Load step 1)
Wait until: RS[r].Qj==0 && r is head of load queue
Action / bookkeeping:
    RS[r].A = RS[r].Vj + RS[r].A;
Instruction state: Execute (Load step 2)
Wait until: Load step 1 complete
Action / bookkeeping:
    Read from Mem[RS[r].A]
Instruction state: Write result (Load)
Wait until: Execution complete and CDB available
Action / bookkeeping:
    ∀x : if ( RegisterStat[x].Qi == r ) {
        Regs[x] = result;
        RegisterStat[x].Qi = 0;
    }
    ∀x : if ( RS[x].Qj == r ) {
        RS[x].Vj = result;
        RS[x].Qj = 0;
    }
    ∀x : if ( RS[x].Qk == r ) {
        RS[x].Vk = result;
        RS[x].Qk = 0;
    }
    RS[r].Busy = no;
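The load path can be sketched in the same spirit. The names LoadBuffer, issue_load, load_step1, load_step2, write_result and demo are hypothetical, memory is a plain dictionary, and the check that r is the head of the load queue is left out (the sketch uses a single buffer).

```python
from dataclasses import dataclass

@dataclass
class LoadBuffer:
    busy: bool = False
    vj: int = 0   # value of the base register
    qj: int = 0   # producer of the base register, 0 = value is ready
    a: int = 0    # immediate first, effective address after step 1

def issue_load(buf, reg_stat, regs, r, rt, imm, rs_reg):
    """Issue 'LD rt, imm(rs_reg)' to load buffer r (assumed to be free)."""
    b = buf[r]
    if reg_stat[rs_reg] != 0:
        b.qj = reg_stat[rs_reg]
    else:
        b.qj, b.vj = 0, regs[rs_reg]
    b.a = imm
    b.busy = True
    reg_stat[rt] = r

def load_step1(buf, r):
    """Compute the effective address once the base register is available."""
    b = buf[r]
    if b.qj != 0:
        raise RuntimeError("base register not yet available")
    b.a = b.vj + b.a

def load_step2(buf, mem, r):
    """Read the value from memory at the effective address."""
    return mem[buf[r].a]

def write_result(buf, reg_stat, regs, r, result):
    """Broadcast the loaded value and free the buffer."""
    for x in reg_stat:
        if reg_stat[x] == r:
            regs[x] = result
            reg_stat[x] = 0
    for b in buf.values():
        if b.qj == r:
            b.vj, b.qj = result, 0
    buf[r].busy = False

def demo():
    regs = {"R2": 100, "F6": 0}
    reg_stat = {"R2": 0, "F6": 0}
    mem = {134: 7}                      # made-up memory content
    buf = {1: LoadBuffer()}
    issue_load(buf, reg_stat, regs, 1, "F6", 34, "R2")
    load_step1(buf, 1)
    address = buf[1].a
    write_result(buf, reg_stat, regs, 1, load_step2(buf, mem, 1))
    return address, regs["F6"], buf[1].busy
```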
Dynamic branch prediction (I)
• In Tomasulo’s algorithm, no instruction is allowed to initiate
execution until all branches preceding the instruction have
completed
• Up to now, we used four techniques to avoid branch hazards
– Stall
– Predict not taken
– Predict taken
– Delayed branch
• All methods are static -> do not take the previous behavior
of branches into account
Dynamic branch prediction (II)
• Six techniques for dynamic branch prediction
– 1-bit branch prediction buffer
– 2-bit branch prediction buffer
– Correlating Branch Prediction Buffer
– Branch Target Buffer
– (Integrated Instruction Fetch Units)
– Return Address Predictors
1bit Branch prediction buffer (I)
• Branch prediction buffer:
– Small memory area indexed by the lower portion of the
address of the branch instruction
– Records whether the branch was taken the last time or
not (1 bit is sufficient)
• Please note:
– Several branches might share the same buffer entry since we
do not use the full branch instruction address for
accessing the branch prediction buffer
1bit Branch Prediction Buffer (II)
• Limitations
– Even for a regular loop (embedded in another large loop)
the 1bit Branch Prediction Buffer will mispredict at least
the first and the last iteration
• 1st iteration: the bit has been set by the last iteration
of the same loop to ‘not-taken’, but the branch will
be taken
• Last iteration: the bit says ‘taken’, but the branch
won’t be taken
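The two mispredictions per execution of the inner loop can be reproduced with a small Python sketch of a single 1-bit entry. The function names run_one_bit and loop_branch are hypothetical, and the loop-closing branch is assumed to be taken on every iteration except the last one.

```python
def run_one_bit(outcomes, initial=False):
    """Count the mispredictions of a single 1-bit entry (True = taken)."""
    bit = initial
    misses = 0
    for taken in outcomes:
        if bit != taken:
            misses += 1
        bit = taken          # remember only the last outcome
    return misses

def loop_branch(iterations, repeats):
    """Outcomes of a loop-closing branch for a loop that is executed `repeats` times."""
    return ([True] * (iterations - 1) + [False]) * repeats

# Inner loop with 10 iterations, executed 3 times by the outer loop.
misses = run_one_bit(loop_branch(iterations=10, repeats=3))
```

Out of the 30 executions of the branch, 6 are mispredicted: the first and the last iteration of each of the three executions of the inner loop.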
2bit Branch Prediction Buffer
• A prediction must miss twice before the prediction is
changed
– Can be extended to n-bits
• States: 11 (predict taken), 10 (predict taken), 01 (predict not taken), 00 (predict not taken)
[State diagram: the transitions between the four states are labeled Taken / Not taken]
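A single 2-bit entry can be sketched in Python as shown below. The sketch assumes the common saturating-counter transitions, which may differ from the slide's state diagram in how the two middle states are wired; the name run_two_bit is hypothetical. States 2 and 3 correspond to 10 and 11 (predict taken), states 0 and 1 to 00 and 01 (predict not taken).

```python
def run_two_bit(outcomes, state=0):
    """Count the mispredictions of one 2-bit saturating counter (True = taken)."""
    misses = 0
    for taken in outcomes:
        if (state >= 2) != taken:
            misses += 1
        # Move one step towards 3 on taken, one step towards 0 on not taken.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses

# Loop-closing branch of a 10-iteration loop that is executed 3 times.
outcomes = ([True] * 9 + [False]) * 3
misses_cold = run_two_bit(outcomes, state=0)   # counter starts at 00
misses_warm = run_two_bit(outcomes, state=3)   # counter starts at 11
```

Once the counter has warmed up, only the last iteration of each loop execution is mispredicted, because a single not-taken outcome does not flip the prediction.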
Correlated branches
• For a (1,1) predictor: each branch has two different
branch prediction buffers, written X / Y:
– X: predictor used in case the previous branch in the application has not been taken
– Y: predictor used in case the previous branch in the application has been taken
• The content of the two branch prediction buffers is
determined by the branch to which they belong
• Which of the two branch prediction buffers is used
depends on the outcome of the previous branch in
the application
Correlated branches - example
if ( d==0 )
    d = 1;
if ( d==1 )
    …

      BNEZ   R1, L1        !branch b1
      DADDIU R1, R0, #1
L1:   DADDIU R3, R1, #-1
      BNEZ   R3, L2        !branch b2
      …
L2:

Initial value of d | d==0? | b1 | Value of d before b2 | d==1? | b2
2 | No | Taken | 2 | No | Taken
0 | Yes | Not taken | 1 | Yes | Not taken
2 | No | Taken | 2 | No | Taken
0 | Yes | Not taken | 1 | Yes | Not taken
Correlated branches - example
d=? | BPB b1 | b1 act. | BPB b2 | b2 act.
2 | NT/NT | | NT/NT |
• The branch prediction buffers for the branches b1 and b2 are
assumed to hold the prediction ‘Not taken’ for both options (previous
branch not taken/taken)
• Assuming BPB for b1 uses the ‘Not taken’ predictor because the
previous branch in the application has not been taken
→ BPB for b1 predicts that b1 will not be taken
Correlated branches - example
d=? | BPB b1 | b1 act. | BPB b2 | b2 act.
2 | NT/NT | T | NT/NT |
→ BPB for b1 predicts that b1 will not be taken
→ b1 is taken (see table for d=2)
Correlated branches - example
d=? | BPB b1 | b1 act. | BPB b2 | b2 act.
2 | NT/NT | T | NT/NT |
  | T/NT | | |
→ updating the ‘Previous branch has not been taken’ part of BPB for
b1 to Taken
→ because b1 has been taken, the ‘last branch has been taken’ part
of BPB b2 will be used
→ BPB b2 predicts that b2 will not be taken
Correlated branches - example
d=? | BPB b1 | b1 act. | BPB b2 | b2 act.
2 | NT/NT | T | NT/NT | T
  | T/NT | | NT/T |
→ b2 is taken (see table for d=2)
→ updating the ‘Previous branch has been taken’ part of BPB for b2
to Taken
→ because b2 has been taken, the ‘last branch has been taken’ part
of BPB b1 will be used
→ BPB b1 predicts that b1 will not be taken
Correlated branches - example
d=? | BPB b1 | b1 act. | BPB b2 | b2 act.
2 | NT/NT | T | NT/NT | T
0 | T/NT | NT | NT/T |
→ b1 is not taken (see table for d=0) → matches prediction!
→ update of BPB b1 does not modify any entry
→ because b1 has not been taken, the ‘last branch has not been
taken’ part of BPB b2 will be used
→ BPB b2 predicts that b2 will not be taken
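The walk-through can be replayed with a short Python sketch of a (1,1) predictor. The names simulate and misses are hypothetical; the sketch assumes that both buffers start as NT/NT and that the branch preceding the first b1 was not taken, as in the example.

```python
def simulate(d_values):
    """Replay branches b1 and b2 of the example for a sequence of initial d values."""
    # bpb[name][0]: prediction if the previous branch was not taken,
    # bpb[name][1]: prediction if the previous branch was taken (False = NT).
    bpb = {"b1": [False, False], "b2": [False, False]}
    last_taken = False
    trace = []
    for d in d_values:
        b1_taken = d != 0            # BNEZ R1, L1
        if d == 0:
            d = 1
        b2_taken = d != 1            # BNEZ R3, L2
        for name, taken in (("b1", b1_taken), ("b2", b2_taken)):
            idx = 1 if last_taken else 0
            trace.append((name, bpb[name][idx], taken))
            bpb[name][idx] = taken   # 1-bit update of the selected predictor only
            last_taken = taken
    return trace, bpb

def misses(trace):
    """Number of trace entries where prediction and outcome differ."""
    return sum(1 for _, predicted, taken in trace if predicted != taken)

trace, bpb = simulate([2, 0, 2, 0])
```

For d = 2, 0, 2, 0 only the first executions of b1 and b2 are mispredicted; afterwards BPB b1 holds T/NT and BPB b2 holds NT/T, and every later prediction in this sequence is correct.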
Correlated branches
• A (2,1) correlated branch predictor
– Uses the behavior of the last 2 branches to choose from
2^2 different predictions
– Uses a 1 bit predictor for each of the 4 prediction buffers
• The four predictors are written A / B / C / D:
– A: predictor used in case the previous 2 branches in the application have both not been taken (00)
– B: predictor used in case the previous branches have the history: second last branch not taken, last branch taken (01)
– C: predictor used in case the previous branches have the history: second last branch taken, last branch not taken (10)
– D: predictor used in case the previous 2 branches in the application have both been taken (11)
Correlated branches
• How do we know which of the four sections of our
branch predictor to use?
– Need to record the behavior of all branches in the
application
• e.g. the outcomes of b1 and b2 in the example for d = 2, 0, 2, 0, … (1 = taken, 0 = not taken):
11001100110011
Correlated branches
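• For a (2,n) branch predictor, the last two branches are
relevant
• 2-bit global branch history (implemented using a 2bit shift register): only the last two outcomes of the stream are kept
11 → history 11
110 → history 10
1100 → history 00
11001 → history 01
110011 → history 11
1100110 → history 10
11001100 → history 00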
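A shift register of this kind can be sketched in a few lines of Python. The function name history_trace is hypothetical, and the register is assumed to start out as all zeros.

```python
def history_trace(outcomes, bits=2):
    """Return the content of a `bits`-wide shift register after each outcome."""
    mask = (1 << bits) - 1
    history = 0
    trace = []
    for bit in outcomes:
        # Shift the newest outcome in on the right, drop the oldest on the left.
        history = ((history << 1) | int(bit)) & mask
        trace.append(format(history, f"0{bits}b"))
    return trace

trace = history_trace("11001100")
```

After the second outcome the register always holds exactly the last two outcomes of the stream, which is the value used to select one of the four predictors.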
Correlated Branches
• Idea: taken/not taken of recently executed branches
is related to behavior of next branch (as well as the
history of that branch behavior)
– Then behavior of recent branches selects between, say, 4
predictions of next branch, updating just that prediction
• (2,2) predictor: 2-bit global, 2-bit local
[Figure: the branch address (4 bits) selects a row of 2-bit per-branch local predictors; the 2-bit global branch history (01 = not taken then taken) selects which predictor of the row supplies the prediction]
Slide based on a lecture by David A. Patterson,
University of California, Berkeley
http://www.cs.berkeley.edu/~pattrsn/252S01
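A (2,2) predictor along these lines can be sketched in Python. The class name TwoTwoPredictor and the function count_misses are hypothetical; the sketch assumes 4 index bits taken from the word-aligned branch address, saturating 2-bit counters, and a history register that starts as 00.

```python
class TwoTwoPredictor:
    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # One row per index value, four 2-bit counters per row (one per history value).
        self.table = [[0] * 4 for _ in range(1 << index_bits)]
        self.history = 0                     # 2-bit global branch history

    def predict(self, pc):
        """Predict taken if the selected counter is in state 2 or 3."""
        return self.table[(pc >> 2) & self.mask][self.history] >= 2

    def update(self, pc, taken):
        """Update only the selected counter, then shift the outcome into the history."""
        row = self.table[(pc >> 2) & self.mask]
        counter = row[self.history]
        row[self.history] = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & 0b11

def count_misses(predictor, pc, outcomes):
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses

# A branch that strictly alternates between taken and not taken.
alternating = [True, False] * 10
misses = count_misses(TwoTwoPredictor(), 0x40, alternating)
```

For the alternating branch the global history distinguishes the two situations, so after a short warm-up (3 mispredictions in this run) every outcome is predicted correctly.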
Accuracy of Different Schemes
[Bar chart: frequency of mispredictions (y-axis from 0% to 20%, bars between 0% and 18%) for three schemes: 4,096 entries with 2 bits per entry, unlimited entries with 2 bits per entry, and 1,024 entries (2,2)]
Slide based on a lecture by David A. Patterson,
University of California, Berkeley
http://www.cs.berkeley.edu/~pattrsn/252S01
Branch Target Buffers
• Branch Target Buffer (BTB): the address of the branch is used as index to get
the prediction AND the branch target address (if taken)
[Figure: the PC of the instruction in FETCH is compared (=?) with the Branch PC column of the BTB; each entry also holds the Predicted PC and extra prediction state bits]
– Yes: instruction is branch and use predicted PC as next PC
– No: branch not predicted, proceed normally (Next PC = PC+4)
Slide based on a lecture by David A. Patterson,
University of California, Berkeley
http://www.cs.berkeley.edu/~pattrsn/252S01
Need Address at Same Time as Prediction (II)
• Send PC to memory and branch target buffer (BTB)
• Entry found in BTB?
– No: is instruction a taken branch?
  • No: normal execution
  • Yes: enter branch address and next PC into BTB
– Yes: send out predicted PC; taken branch?
  • No: mispredicted branch, kill fetched instruction
  • Yes: branch correctly predicted
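The flow chart can be sketched in Python as a lookup at fetch time and an update once the branch is resolved. The class BranchTargetBuffer and its methods are hypothetical; the buffer is modeled as an unbounded dictionary rather than a small indexed table, and removing the entry of a branch that turns out not to be taken is an assumption that the slide does not state.

```python
class BranchTargetBuffer:
    def __init__(self):
        self.entries = {}                # branch PC -> predicted target PC

    def predict(self, pc):
        """Return (hit, next_pc) for the instruction fetched at pc."""
        if pc in self.entries:
            return True, self.entries[pc]
        return False, pc + 4             # no entry: proceed normally

    def resolve(self, pc, taken, target):
        """Update the buffer with the real outcome; return True if the fetch was wrong."""
        hit, predicted = self.predict(pc)
        actual_next = target if taken else pc + 4
        if taken:
            self.entries[pc] = target    # enter branch address and next PC
        elif hit:
            del self.entries[pc]         # assumption: drop entries of not-taken branches
        return predicted != actual_next

btb = BranchTargetBuffer()
first = btb.resolve(0x40, True, 0x20)     # taken branch without entry: entry is created
second = btb.resolve(0x40, True, 0x20)    # entry found and branch taken: correct
third = btb.resolve(0x40, False, 0x20)    # entry found but not taken: mispredicted
```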
Special Case Return Addresses
• Register indirect branches: the target address is hard to predict
• SPEC89: 85% of such branches are procedure returns
• Save return addresses in a small buffer that acts like a
stack: 8 to 16 entries give a small miss rate
Slide based on a lecture by David A. Patterson,
University of California, Berkeley
http://www.cs.berkeley.edu/~pattrsn/252S01
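A return address buffer of this kind can be sketched in Python. The class ReturnAddressStack is hypothetical, and overwriting the oldest entry on overflow is an assumption made for the sketch.

```python
class ReturnAddressStack:
    def __init__(self, size=8):
        self.size = size
        self.stack = []

    def on_call(self, return_address):
        """Push the return address of a procedure call."""
        if len(self.stack) == self.size:
            self.stack.pop(0)            # overflow: forget the oldest return address
        self.stack.append(return_address)

    def predict_return(self):
        """Pop the predicted target of a procedure return, None if the stack is empty."""
        return self.stack.pop() if self.stack else None

ras = ReturnAddressStack(size=2)
for address in (0x104, 0x208, 0x30C):    # three nested calls, only two entries
    ras.on_call(address)
predictions = [ras.predict_return() for _ in range(3)]
```

With only two entries, the two innermost returns are predicted correctly, while the outermost return finds the stack empty; a deeper buffer reduces how often this happens.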
Hardware based speculation
• Branch prediction reduces the direct stalls caused by branches
• Instructions can be issued using dynamic branch
prediction, but cannot be executed until the branch
outcome is known
• Speculative execution extends the concept of dynamic
scheduling
– Speculates on the outcome of the branch
– Executes the following instructions
• Requires the ability to undo instructions in case the
prediction was wrong.
Hardware based speculation (II)
• Extending Tomasulo’s algorithm to support speculation:
– Separate the step of bypassing results among instructions
from the completion of the instruction
– Add another step
• Issue
• Execute
• Write result
• Commit
– Instructions execute out-of-order but commit in-order
– Additional set of hardware buffers to hold the results of
instructions which have not yet been committed: Reorder
buffer (ROB)
Reorder Buffers
• Hold the results of instructions between the time an
instruction finishes execution and the time the instruction is
committed
• Act as additional reservation stations
– ROB can be the source of operands of other instructions
• Each ROB entry contains four fields
– Instruction type: branch/store/ALU operation
– Destination: register number or memory address where
result should be written
– Value: result value of the instruction
– Ready: has the instruction completed execution?
Four steps of execution (I)
• Issue:
– Get instruction from instruction queue
– Issue instruction if a reservation station is empty and a
ROB entry is available
• Execute:
– If operands available, execute
– New: for a store instruction, execution only consists of the calculation of
the effective address at this point
• Write result:
– Write result to CDB
– Any reservation station/ROB entry waiting for the result should update
– Register file not modified at this point
Four steps of execution (II)
• Commit:
– Normal case (prediction was correct):
• instruction reaches head of ROB
• Update register file
• Remove entry from ROB
– Store operation:
• Instruction reaches head of ROB
• Update of memory location
– Incorrect prediction:
• When a branch instruction reaches head of ROB and
the hardware indicates that the prediction was wrong,
ROB is flushed and execution restarted.
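The commit step can be sketched in Python with a queue of ROB entries. RobEntry and commit are hypothetical names; the mispredicted flag is an extra field added for the sketch on top of the four fields listed earlier, and restarting the fetch at the correct branch target is not modeled.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class RobEntry:
    kind: str                    # "alu", "store" or "branch"
    dest: object = None          # register name or memory address
    value: object = None
    ready: bool = False
    mispredicted: bool = False   # only meaningful for branches

def commit(rob, regs, mem):
    """Retire ready entries from the head of the ROB, in program order."""
    while rob and rob[0].ready:
        head = rob.popleft()
        if head.kind == "branch":
            if head.mispredicted:
                rob.clear()      # flush all younger, speculative entries
                return "flush"
        elif head.kind == "store":
            mem[head.dest] = head.value    # memory is updated only at commit
        else:
            regs[head.dest] = head.value   # register file is updated only at commit
    return "ok"
```

An entry that has finished execution still waits behind an older entry that is not ready, which is what keeps the commit order identical to the program order.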
The same example as for scoreboarding
L.D   F6, 34(R2)
L.D   F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Following slides are based on a lecture by Jelena Mirkovic,
University of Delaware
http://www.cis.udel.edu/~sunshine/courses/F04/CIS662/class12.pdf
Assumptions:
• ADD and SUB take 2 clock cycles
• MULT takes 10 clock cycles
• DIV takes 40 clock cycles
• 2 Load/Store, 3 ADD and 2 Mult reservation stations
Time=1: Issue first load
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Load1 | Yes | Load | Regs[R2] | | | | #1 | 34
Not busy: Load2, Add1, Add2, Add3, Mult1, Mult2
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | | | | #1 | | | | |
Busy | | | | Yes | | | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | Yes | L.D F6, 34(R2) | Issue | F6 |
Entries 2 to 6: empty
Time=2: First load executes, second load issues
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Load1 | Yes | Load | Regs[R2] | | | | #1 | +34
Load2 | Yes | Load | Regs[R3] | | | | #2 | 45
Not busy: Add1, Add2, Add3, Mult1, Mult2
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | | #2 | | #1 | | | | |
Busy | | Yes | | Yes | | | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | Yes | L.D F6, 34(R2) | Execute | F6 |
2 | Yes | L.D F2, 45(R3) | Issue | F2 |
Entries 3 to 6: empty
Time=3: First load executes, second load executes, Mul is issued
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Load1 | Yes | Load | | | | | #1 | Regs[R2]+34
Load2 | Yes | Load | Regs[R3] | | | | #2 | +45
Mult1 | Yes | Mult | | Regs[F4] | #2 | | #3 |
Not busy: Add1, Add2, Add3, Mult2
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | #2 | | #1 | | | | |
Busy | Yes | Yes | | Yes | | | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | Yes | L.D F6, 34(R2) | Execute | F6 |
2 | Yes | L.D F2, 45(R3) | Execute | F2 |
3 | Yes | MUL.D F0, F2, F4 | Issue | F0 |
Entries 4 to 6: empty
Time=4: First load writes result, second load executes, Mul stalled, Sub issued
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Load2 | Yes | Load | | | | | #2 | Regs[R3]+45
Add1 | Yes | Sub | | Mem[34+Regs[R2]] | #2 | | #4 |
Mult1 | Yes | Mult | | Regs[F4] | #2 | | #3 |
Not busy: Load1, Add2, Add3, Mult2
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | #2 | | #1 | #4 | | | |
Busy | Yes | Yes | | Yes | Yes | | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | Yes | L.D F6, 34(R2) | Write result | F6 | Mem[34+Regs[R2]]
2 | Yes | L.D F2, 45(R3) | Execute | F2 |
3 | Yes | MUL.D F0, F2, F4 | Stalled in issue | F0 |
4 | Yes | SUB.D F8, F2, F6 | Issue | F8 |
Entries 5 and 6: empty
Time=5: First load commits, second load writes result, Mul and Sub stalled, Div issued
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Add1 | Yes | Sub | Mem[45+Regs[R3]] | Mem[34+Regs[R2]] | | | #4 |
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add2, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | #2 | | | #4 | #5 | | |
Busy | Yes | Yes | | | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | Yes | L.D F2, 45(R3) | Write result | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Stalled in issue | F0 |
4 | Yes | SUB.D F8, F2, F6 | Stalled in issue | F8 |
5 | Yes | DIV.D F10, F0, F6 | Issue | F10 |
Entry 6: empty
Time=6: Second load commits, Mul (1/10), Sub (1/2), Div stalled, Add issued
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Add1 | Yes | Sub | Mem[45+Regs[R3]] | Mem[34+Regs[R2]] | | | #4 |
Add2 | Yes | Add | | Mem[45+Regs[R3]] | #4 | | #6 |
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Execute | F8 |
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Issue | F6 |
Time=7: Mul (2/10), Sub (2/2), Div stalled, Add stalled
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Add1 | Yes | Sub | Mem[45+Regs[R3]] | Mem[34+Regs[R2]] | | | #4 |
Add2 | Yes | Add | | Mem[45+Regs[R3]] | #4 | | #6 |
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Execute | F8 |
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Stalled in issue | F6 |
Time=8: Mul (3/10), Sub writes result, Div stalled, Add stalled
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Add2 | Yes | Add | X | Mem[45+Regs[R3]] | | | #6 |
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add1, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Write result | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Stalled in issue | F6 |
Time=9: Mul (4/10), Div stalled, Add executes (1/2)
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Add2 | Yes | Add | X | Mem[45+Regs[R3]] | | | #6 |
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add1, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Waiting to commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Execute | F6 |
Time=11: Mul (6/10), Div stalled, Add writes result
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add1, Add2, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Waiting to commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Write result | F6 | Y
Time=12: Mul (7/10), Div stalled
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Mult1 | Yes | Mult | Mem[45+Regs[R3]] | Regs[F4] | | | #3 |
Mult2 | Yes | Div | | Mem[34+Regs[R2]] | #3 | | #5 |
Not busy: Load1, Load2, Add1, Add2, Add3
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Execute | F0 |
4 | Yes | SUB.D F8, F2, F6 | Waiting to commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Waiting to commit | F6 | Y
Time=16: Mul writes result, Div stalled
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Mult2 | Yes | Div | Z | Mem[34+Regs[R2]] | | | #5 |
Not busy: Load1, Load2, Add1, Add2, Add3, Mult1
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | #3 | | | #6 | #4 | #5 | | |
Busy | Yes | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | Yes | MUL.D F0, F2, F4 | Write result | F0 | Z
4 | Yes | SUB.D F8, F2, F6 | Waiting to commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Stalled in issue | F10 |
6 | Yes | ADD.D F6, F8, F2 | Waiting to commit | F6 | Y
Time=17: Mul commits, Div executes (1/40)
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Mult2 | Yes | Div | Z | Mem[34+Regs[R2]] | | | #5 |
Not busy: Load1, Load2, Add1, Add2, Add3, Mult1
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | | | | #6 | #4 | #5 | | |
Busy | | | | Yes | Yes | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | No | MUL.D F0, F2, F4 | Commit | F0 | Z
4 | Yes | SUB.D F8, F2, F6 | Waiting to commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Execute | F10 |
6 | Yes | ADD.D F6, F8, F2 | Waiting to commit | F6 | Y
Time=18: Sub commits, Div executes (2/40)
Instruction status: [table with the columns Instruction | Issue | Execute | Write result | Commit]
Reservation station
Name | Busy | Op | Vj | Vk | Qj | Qk | Dest | A
Mult2 | Yes | Div | Z | Mem[34+Regs[R2]] | | | #5 |
Not busy: Load1, Load2, Add1, Add2, Add3, Mult1
Register result status
Field | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30
Reorder# | | | | #6 | | #5 | | |
Busy | | | | Yes | | Yes | | |
Reorder buffer
Entry | Busy | Instruction | State | Destination | Value
1 | No | L.D F6, 34(R2) | Commit | F6 | Mem[34+Regs[R2]]
2 | No | L.D F2, 45(R3) | Commit | F2 | Mem[45+Regs[R3]]
3 | No | MUL.D F0, F2, F4 | Commit | F0 | Z
4 | No | SUB.D F8, F2, F6 | Commit | F8 | X
5 | Yes | DIV.D F10, F0, F6 | Execute | F10 |
6 | Yes | ADD.D F6, F8, F2 | Waiting to commit | F6 | Y
… and so on…
• Time 57: DIV writes result
• Time 58: DIV commits
• Time 59: Add commits