L19-DirectionPrediction

Computer Architecture: A Constructive Approach
Branch Direction Prediction –
Six Stage Pipeline
Joel Emer
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
April 23, 2012
http://csg.csail.mit.edu/6.S078
L19-1
NA pred with decode feedback
[Figure: six-stage pipeline (Fetch, Decode, Reg Read, Execute, Memory, Writeback) with a Next Address Prediction block.]
Decode detected mispredicts
Non-branch
When nextPC != PC+4
=> use PC+4

Unconditional target known at decode
When nextPC != known target
=> use known target

Conditional branch
When nextPC is neither PC+4 nor the decoded target
=> use PC+4

Can we do better than PC+4?
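The three decode-time rules above can be sketched as a small software model. This is an illustrative Python model, not the course's Bluespec code; the names `BrClass` and `decode_redirect` are assumptions made up for this sketch.

```python
from enum import Enum


class BrClass(Enum):
    NON_BRANCH = 0
    UNCOND_KNOWN = 1   # unconditional branch whose target is known at decode
    COND_BRANCH = 2
    OTHER = 3          # anything decode cannot check (assumed category)


def decode_redirect(pc, next_pc, br_class, known_target=None):
    """Return the corrected next PC, or None if decode sees no mispredict."""
    pc_plus4 = pc + 4
    if br_class is BrClass.NON_BRANCH:
        # A non-branch must fall through.
        return pc_plus4 if next_pc != pc_plus4 else None
    if br_class is BrClass.UNCOND_KNOWN:
        # The target is certain, so any other prediction is wrong.
        return known_target if next_pc != known_target else None
    if br_class is BrClass.COND_BRANCH:
        # Only PC+4 or the decoded target are possible; default to PC+4.
        if next_pc not in (pc_plus4, known_target):
            return pc_plus4
        return None
    return None
```

Note that the conditional-branch case always falls back to PC+4, which is the choice the question above asks us to improve on.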
Dynamic Branch Prediction
Branch direction prediction:
Learn and predict the direction a branch will go
Standard prediction principles:
Temporal correlation
The way a branch resolves may be a good predictor
of the way it will resolve at the next execution
Spatial correlation
Several branches may resolve in a highly correlated
manner (a preferred path of execution)
One-bit predictor
Predict branch will go same direction it went last time
[Figure: k bits of the Fetch PC (above the two low-order 00 bits) form the BHT Index into a 2^k-entry BHT with 1 bit/entry, read alongside the I-Cache fetch. Decode derives Branch? from the opcode and adds the offset to form the Target PC; the BHT supplies Taken/¬Taken?]
One-bit predictor
// Interface
interface DirectionPred;
method ActionValue#(Tuple2#(Bool, DirInfo))
predict(Addr addr);
method Action train(DirInfo dirInfo, Bool taken);
endinterface
// Feedback information
typedef 64 BPRows;
typedef Bit#(TLog#(BPRows)) DirLineIndex;
typedef DirLineIndex DirInfo;
One-bit predictor (continued)
module mkDirectionPredictor(DirectionPred);
  // Array of prediction bits
  RegFile#(DirLineIndex, Bool) dirArray <- mkRegFileFull();

  method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
    DirLineIndex index = truncate(addr >> 2);
    // Return prediction saved in array
    return tuple2(dirArray.sub(index), index);
  endmethod

  method Action train(DirInfo dirInfo, Bool taken);
    DirLineIndex index = dirInfo;
    // Update array with last actual behavior
    dirArray.upd(index, taken);
  endmethod
endmodule
When should we train?
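To experiment with the one-bit scheme outside the pipeline, here is a minimal behavioural model in Python. It mirrors the predict/train split of the module above. The class name and the all-not-taken initial state are assumptions of this sketch, since the contents of a register file at reset are not specified on the slide.

```python
BP_ROWS = 64  # same table size as the BPRows typedef


class OneBitPredictor:
    def __init__(self):
        # Assumption: every entry starts as "not taken".
        self.dir_array = [False] * BP_ROWS

    def predict(self, addr):
        # Drop the two low-order bits, then keep log2(BP_ROWS) index bits.
        index = (addr >> 2) % BP_ROWS
        # The index is returned so it can travel with the instruction
        # and come back later as training feedback.
        return self.dir_array[index], index

    def train(self, index, taken):
        # Remember only the most recent outcome.
        self.dir_array[index] = taken
```

Two branches whose addresses differ by a multiple of 4 * BP_ROWS bytes share an entry in this model, so they overwrite each other's prediction.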
Two-bit Predictor
Smith, 1981
How well does one-bit predictor do on short trip count loops?
• Assume 2 direction prediction bits per instruction

State  Meaning
11     Strongly taken
10     Weakly taken
01     Weakly ¬taken
00     Strongly ¬taken

On taken: move up one state (toward 11). On ¬taken: move down one state (toward 00).
Implement using saturating counter
Saturating Counter
typedef Bit#(2) Counter;
function Counter updateCounter(Bool dir, Counter counter);
return dir?saturatingInc(counter)
:saturatingDec(counter);
endfunction
function Counter saturatingInc(Counter counter);
let plusOne = counter + 1;
return (plusOne == 0)?counter:plusOne;
endfunction
function Counter saturatingDec(Counter counter);
return (counter == 0)?0:counter-1;
endfunction
How do we determine prediction from counter?
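One way to explore both questions (how a prediction is read from the counter, and how the schemes behave on short trip count loops) is a small Python model. It takes the prediction from the high bit of the counter. The function names, the initial counter value of 0 and the example loop pattern are assumptions of this sketch.

```python
def update_counter(taken, counter):
    # Two-bit saturating counter: stays within 0..3.
    if taken:
        return min(counter + 1, 3)
    return max(counter - 1, 0)


def predict_taken(counter):
    # The prediction is the high bit of the two-bit counter.
    return (counter >> 1) == 1


def mispredicts_two_bit(outcomes, counter=0):
    misses = 0
    for taken in outcomes:
        if predict_taken(counter) != taken:
            misses += 1
        counter = update_counter(taken, counter)
    return misses


def mispredicts_one_bit(outcomes, last=False):
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken  # one-bit scheme: remember only the last outcome
    return misses


# A loop branch taken three times and then falling through, visited 10 times.
loop = [True, True, True, False] * 10
```

Once warmed up, the one-bit scheme mispredicts twice per visit to the loop (the exit and the next first iteration), while the two-bit counter mispredicts only the exit.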
Two-bit predictor
[Figure: k bits of the Fetch PC (above the two low-order 00 bits) form the BHT Index into a 2^k-entry BHT with 2 bits/entry, which supplies Taken/¬Taken?]
Two-bit predictor
typedef 64 BPRows;
typedef Bit#(TLog#(BPRows)) DirLineIndex;
// DirInfo data: feedback state for training
typedef struct {
  DirLineIndex index;
  Counter counter;
} DirInfo deriving(Bits, Eq);

module mkDirectionPredictor(DirectionPred);
  // Direction predictor state
  RegFile#(DirLineIndex,Counter) cntArray <- mkRegFileFull();
Two-bit predictor (continued)
  method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
    // Training information is index and counter
    DirInfo info = ?;
    info.index = truncate(addr >> 2);
    info.counter = cntArray.sub(info.index);
    // Prediction is high bit of counter
    Bit#(1) pred = truncate(info.counter >> 1);
    Bool taken = (pred == 1);
    return tuple2(taken, info);
  endmethod

  // Train by updating counter
  method Action train(DirInfo info, Bool taken);
    cntArray.upd(info.index, updateCounter(taken, info.counter));
  endmethod
endmodule
Exploiting Spatial Correlation
Yeh and Patt, 1992
if (x[i] < 7) then
y += 1;
if (x[i] < 5) then
c -= 4;
If first condition false, second condition also false
Also works well for short trip count loops.
Implemented with a history register, ‘hist’, that records the
direction of the last N branches executed by the processor.
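The history-register idea can be tried out with a short Python model. For simplicity this sketch predicts and updates back to back, so the history always holds actual outcomes; a real pipeline has to update it speculatively. The class name, the 6-bit history length and the zero initial state are assumptions of this sketch.

```python
HIST_BITS = 6
ROWS = 1 << HIST_BITS


class GlobalHistoryPredictor:
    def __init__(self):
        self.hist = 0                # directions of the last HIST_BITS branches
        self.counters = [0] * ROWS   # two-bit counters, indexed by history

    def predict(self):
        return (self.counters[self.hist] >> 1) == 1

    def update(self, taken):
        c = self.counters[self.hist]
        self.counters[self.hist] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the actual outcome into the history register.
        self.hist = ((self.hist << 1) | int(taken)) & (ROWS - 1)


def run(outcomes):
    """Return a list of flags, True where the branch was mispredicted."""
    p = GlobalHistoryPredictor()
    flags = []
    for taken in outcomes:
        flags.append(p.predict() != taken)
        p.update(taken)
    return flags


# Short trip count loop: taken three times, then not taken, repeated.
flags = run([True, True, True, False] * 25)
```

After warm-up, each history value in this pattern is always followed by the same outcome, so even the loop exit is predicted correctly in this model.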
Ghist predictor
typedef 64 BPRows;
typedef Bit#(TLog#(BPRows)) DirLineIndex;
typedef Bit#(2) Counter;
// DirInfo data
typedef struct {
DirLineIndex hist;
Counter counter;
} DirInfo deriving(Bits, Eq);
module mkDirectionPredictor(DirectionPred);
// Direction predictor state
Reg#(DirLineIndex) hist <- mkReg(0);
RegFile#(DirLineIndex,Counter) cntArray <- mkRegFileFull();
Global history predictor
method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
  // Calculate feedback information
  DirInfo info = ?;
  info.hist = hist;
  info.counter = cntArray.sub(hist);
  Bit#(1) pred = truncate(info.counter >> 1);
  // Shift new prediction into history register
  hist <= truncate(hist << 1 | zeroExtend(pred));
  return tuple2((pred == 1), info);
endmethod
How good are predictions while waiting for training?
Global history predictor
method Action train(DirInfo info, Bool taken);
  cntArray.upd(info.hist, updateCounter(taken, info.counter));
endmethod

// Restore history to state it would be in after the desired prediction
method Action repair(DirInfo info, Bool taken);
  hist <= truncate((info.hist << 1) | zeroExtend(pack(taken)));
endmethod
endmodule
What is the state of ‘hist’ after
redirects from decode and execute?
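To reason about that question, it can help to model the speculative history update and its repair in software. The Python sketch below follows the predict/train/repair split of the module above. The class name, the tuple used for the feedback information and the zero initial state are assumptions of this sketch.

```python
HIST_BITS = 6
MASK = (1 << HIST_BITS) - 1


class SpeculativeHistory:
    def __init__(self):
        self.hist = 0
        self.counters = [0] * (1 << HIST_BITS)

    def predict(self):
        # Snapshot the history and counter used for this prediction.
        info = (self.hist, self.counters[self.hist])
        pred = info[1] >> 1
        # Speculatively shift the prediction into the history.
        self.hist = ((self.hist << 1) | pred) & MASK
        return pred == 1, info

    def train(self, info, taken):
        hist, counter = info
        self.counters[hist] = min(counter + 1, 3) if taken else max(counter - 1, 0)

    def repair(self, info, taken):
        # Rebuild the history from the snapshot plus the actual outcome,
        # discarding whatever younger wrong-path predictions shifted in.
        hist, _ = info
        self.hist = ((hist << 1) | int(taken)) & MASK
```

In this model, any predictions made between a mispredicted branch and its repair leave no trace in `hist` once `repair` has run.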
NA pred with decode feedback
[Figure: six-stage pipeline (Fetch, Decode, Reg Read, Execute, Memory, Writeback) with Next Address Prediction and Direction Prediction blocks.]
Direction prediction recipe
Execute
• Send redirects on mispredicts (unchanged)
• Send direction prediction training
Decode
• Check if next address matches direction pred
• Send redirect if different
Fetch
• Generate prediction
• Learn from feedback
• Accept redirects from later stages
Add direction feedback
// Feedback needs information for training direction predictor
typedef struct {
  Bool correct;
  NaInfo naPredInfo;
  Addr nextAddr;
  DirInfo dirPredInfo;
  Bool taken;
} Feedback deriving (Bits, Eq);

FIFOF#(Tuple3#(Epoch,Epoch,Feedback)) decFeedback <- mkFIFOF;
FIFOF#(Tuple2#(Epoch,Feedback)) execFeedback <- mkFIFOF;
Execute (branch analysis)
// after executing instruction...
let nextEeEpoch = eeEpoch;
let cond = execData.execInst.cond;
let nextPc = cond ? execData.execInst.addr : execData.pc+4;
// Recall: nextAddrPred may have been set in decode
if (nextPc != execData.nextAddrPred) nextEeEpoch = nextEeEpoch + 1;
eeEpoch <= nextEeEpoch;
// Always send feedback
execFeedback.enq(tuple2(nextEeEpoch,
  Feedback{correct: (nextPc == execData.nextAddrPred),
           taken: cond,
           dirPredInfo: execData.dirPredInfo,
           naPredInfo: execData.naPredInfo,
           nextAddr: nextPc}));
// enqueue instruction to next stage
Decode with mispredict detect
rule doDecode;
  let decData = newDecData(fr.first);
  // Determine if epoch of incoming instruction is on good path:
  // new exec epoch, or same dec epoch
  let correctPath = (decData.execEpoch != deEpoch)
                 || (decData.decEpoch == ddEpoch);
  let instResp = decData.fInst.instResp;
  let pcPlus4 = decData.pc+4;

  if (correctPath)
  begin
    decData.decInst = decode(instResp, pcPlus4);
    let target = knownTargetAddr(decData.decInst);
    let brClass = getBrClass(decData.decInst);
    let predTarget = decData.nextAddrPred;
    let predDir = decData.takenPred;
Decode with mispredict detect
    // Calculate target as best as decode can
    let decodedTarget = case (brClass)
      NonBranch: pcPlus4;
      UncondKnown: target;
      CondBranch: (predDir ? target : pcPlus4);
      default: decData.nextAddrPred;
    endcase;
    // Wrong next addr?
    if (decodedTarget != predTarget) begin
      // New dec epoch
      decData.decEpoch = decData.decEpoch + 1;
      // Tell exec addr of next instruction!
      decData.nextAddrPred = decodedTarget;
      // Send feedback
      decFeedback.enq(
        tuple3(decData.execEpoch, decData.decEpoch,
          Feedback{correct: False,
                   naPredInfo: decData.naPredInfo,
                   nextAddr: decodedTarget,
                   dirPredInfo: decData.dirPredInfo,
                   taken: decData.takenPred}));
    end
    // Enqueue to next stage on correct path
    dr.enq(decData);
  end // of correct path
Decode with mispredict detect
  else
  begin // incorrect path
    // Preserve current epoch if instruction on incorrect path
    decData.decEpoch = ddEpoch;
    decData.execEpoch = deEpoch;
  end
  // decData.*Epoch have been set properly so we always save them.
  ddEpoch <= decData.decEpoch;
  deEpoch <= decData.execEpoch;
  fr.deq;
endrule
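The epoch bookkeeping in the decode rule can be summarised as a pure function. The Python sketch below is only a behavioural model: the function name, the `mispredicted` flag (standing in for the decoded-target comparison) and the use of unbounded integers for epochs are assumptions of this sketch.

```python
def decode_epoch_step(inst_exec_epoch, inst_dec_epoch,
                      de_epoch, dd_epoch, mispredicted):
    """Return (correct_path, new_de_epoch, new_dd_epoch) for one instruction."""
    # Good path: a new exec epoch, or the same dec epoch as decode expects.
    correct_path = (inst_exec_epoch != de_epoch) or (inst_dec_epoch == dd_epoch)
    if correct_path:
        new_de, new_dd = inst_exec_epoch, inst_dec_epoch
        if mispredicted:
            # Decode redirects fetch, so it starts a new dec epoch.
            new_dd = new_dd + 1
    else:
        # Wrong-path instruction: keep the current epochs unchanged.
        new_de, new_dd = de_epoch, dd_epoch
    return correct_path, new_de, new_dd
```

For example, after decode detects a mispredict in epochs (0, 0), instructions still tagged with dec epoch 0 are treated as wrong-path until fetch starts tagging with dec epoch 1.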
Handling redirect from execute
if (execFeedback.notEmpty) begin
  match {.execEpoch, .fb} = execFeedback.first;
  execFeedback.deq;
  if (!fb.correct) begin
    // Train and repair on redirect
    dirPred.repair(fb.dirPredInfo, fb.taken);
    dirPred.train(fb.dirPredInfo, fb.taken);
    naPred.repair(fb.naPredInfo, fb.nextAddr);
    naPred.train(fb.naPredInfo, fb.nextAddr);
    feEpoch <= execEpoch;
    fetchPc <= fb.nextAddr;
  end else begin
    // Just train on correct prediction
    dirPred.train(fb.dirPredInfo, fb.taken);
    naPred.train(fb.naPredInfo, fb.nextAddr);
    enqInst;
  end
end
Handling redirect from decode
else if (decFeedback.notEmpty) begin
  decFeedback.deq;
  match {.execEpoch, .decEpoch, .fb} = decFeedback.first;
  if (execEpoch == feEpoch) begin // epoch unchanged
    if (!fb.correct) begin
      // Just repair; never train on feedback from decode
      fdEpoch <= decEpoch;
      dirPred.repair(fb.dirPredInfo, fb.taken);
      naPred.repair(fb.naPredInfo, fb.nextAddr);
      fetchPc <= fb.nextAddr;
    end
    else // dec feedback on correct prediction
      enqInst;
  end
  else // dec feedback, but fetch is in new exec epoch
    enqInst;
end
else // no feedback
  enqInst;