236381 - VLSI Project

A Perceptron based Branch Prediction study
Under the supervision of Michael Behar
Performed by
Apfelbaum Marcel
Zlotnik Alexander
Based on: Neural Methods for Dynamic Branch Prediction
(Rutgers University; The University of Texas at Austin)
Branch prediction has been one of the main concerns of processor designers for the last 20
years. Branch instructions make up about 20% of the instructions executed by the processor.
Correctly predicting a branch saves the processor time and work, so
high branch prediction rates lead to faster program execution and better power
management. Both of these issues are highly "fashionable".
Well-accepted structures for dynamic branch predictors are based on two-bit counters.
In this project, we explore a specific type of artificial neurons and propose a scheme
that uses perceptrons instead of two-bit counters. Because the size of perceptrons
scales linearly with the size of their inputs, which in our case is the branch history, a
perceptron predictor can exploit long history lengths. By contrast, traditional two-level
adaptive schemes use pattern history tables (PHTs), which are indexed by the
branch history and which therefore grow exponentially with the history length. Thus
the PHT structure limits the length of the history register to the logarithm of the
number of counters. As a result, for the same hardware budget, a perceptron predictor
can consider much longer histories than PHT-based schemes. For example, for a 4 KB
hardware budget, a PHT-based predictor can use a history length of 14, whereas a
version of the perceptron predictor can use a history length of 34. These longer
history lengths lead to higher accuracy.
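The 4 KB comparison above can be checked with a little arithmetic. The sketch below assumes a PHT of two-bit counters indexed directly by the history, and, as an assumption not stated in the text, perceptrons with one 8-bit weight per history bit. The function names are illustrative only.

```python
def pht_history_length(budget_bytes, counter_bits=2):
    """Longest history that can index a table of counters within the budget."""
    counters = (budget_bytes * 8) // counter_bits
    # A table of 2**h counters needs h history bits, so h = floor(log2(counters)).
    return counters.bit_length() - 1

def perceptron_count(budget_bytes, history_length, weight_bits=8):
    """Number of perceptrons that fit, assuming one weight per history bit."""
    bits_per_perceptron = history_length * weight_bits
    return (budget_bytes * 8) // bits_per_perceptron

budget = 4 * 1024
print(pht_history_length(budget))    # 14
print(perceptron_count(budget, 34))  # 120
```

Under these assumptions the PHT history is capped at 14 bits because each extra history bit doubles the table, while a history of 34 costs each perceptron only 34 weights, leaving room for on the order of a hundred perceptrons.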
A perceptron is a learning device that takes a set of input values and combines them
with a set of weights (which are learned through training) to produce an output value.
In our predictor, each weight represents the degree of correlation between the
behavior of a past branch and the behavior of the branch being predicted. Positive
weights represent positive correlation, and negative weights represent negative
correlation. To make a prediction, each weight contributes in proportion to its
magnitude in the following manner. If its corresponding branch was taken, we add the
weight; otherwise we subtract the weight. If the resulting sum is positive, we predict
taken; otherwise we predict not taken. The perceptrons are trained by an algorithm
that increments a weight when the branch outcome agrees with the weight’s
correlation and decrements the weight otherwise.
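The prediction and training rules described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the history is modeled as a list of booleans (True meaning taken), "agrees with the weight's correlation" is read here as "the past branch went the same way as the current outcome", and training is applied on every outcome. Weight saturation, which a hardware predictor would need, is left out.

```python
def perceptron_output(weights, history):
    """Add each weight if its branch was taken, subtract it otherwise."""
    return sum(w if taken else -w for w, taken in zip(weights, history))

def predict_taken(weights, history):
    """Predict taken only when the sum is strictly positive."""
    return perceptron_output(weights, history) > 0

def train(weights, history, taken):
    """Increment weights whose history bit matches the outcome, decrement the rest."""
    for i, past_taken in enumerate(history):
        weights[i] += 1 if past_taken == taken else -1

# Toy example: the branch always goes the same way as the most recent branch.
weights = [0, 0, 0]
for history in ([True, False, True], [False, False, True],
                [True, True, False], [False, True, False]):
    train(weights, history, taken=history[0])
print(weights)  # [4, 0, 0]
```

In the toy run only the first weight grows, because only the first history position is correlated with the outcome; the other two receive as many increments as decrements.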
Study Proposal
Our project consists of two parts:
1. Implementation of a perceptron-based branch prediction mechanism.
We will take a well-known processor simulator, SimpleScalar (http://simplescalar.com),
and add a neural-network branch prediction mechanism to it. The aim is to explore the
practicality and applicability of such a mechanism and to measure its success rates.
2. Research of a proposed idea regarding perceptron-based branch prediction.
According to Jimenez's study of perceptron-based branch prediction, one of the largest
advantages of perceptrons over two-level prediction is the memory saving. Perceptron-based
prediction consumes memory linear in the length of the history register, as
opposed to the exponential growth of the memory consumed in two-level prediction.
This allows perceptrons to consider much longer histories. Our suggestion is
that the prediction of a given branch using a perceptron depends strongly on some of the
other branches, according to program flow, and much less on others. This should be
visible in the weights of a given perceptron. We will investigate this suggestion.
Design and Implementation
For every branch there is an entry in a table. The entry holds a vector of weights, which
is multiplied (dot product) with the global history register.
Every time we meet a conditional branch instruction, we compute the product and
make a prediction. Based on the prediction, we fetch instructions and move the
branch instruction to the next pipeline stage. After the branch condition is evaluated, we
check whether the prediction was correct. If not, we flush the fetched instructions from
the pipeline, reset the PC, and continue from the correct location. In any case, we train
the perceptron, i.e. we update the predictor according to the history register at the
time of prediction. SimpleScalar calculates the prediction rates.
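The table lookup, prediction, and update flow described above can be sketched outside the simulator. This is a simplified model under stated assumptions: the class and function names are hypothetical and are not SimpleScalar's interface, the table is indexed by the branch address modulo the table size, and branches resolve immediately, so the history register at update time is the same as at prediction time (a pipelined implementation would have to save it).

```python
class PerceptronPredictor:
    def __init__(self, table_size, history_length):
        # One weight vector per table entry, all weights starting at zero.
        self.table = [[0] * history_length for _ in range(table_size)]
        # Global history register, most recent branch first (True = taken).
        self.history = [False] * history_length

    def _weights(self, pc):
        return self.table[pc % len(self.table)]

    def predict(self, pc):
        weights = self._weights(pc)
        total = sum(w if t else -w for w, t in zip(weights, self.history))
        return total > 0

    def update(self, pc, taken):
        # Train with the history seen at prediction time, then shift in the outcome.
        weights = self._weights(pc)
        for i, past_taken in enumerate(self.history):
            weights[i] += 1 if past_taken == taken else -1
        self.history = [taken] + self.history[:-1]

def prediction_rate(predictor, trace):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace) if trace else 0.0

# A single branch that alternates taken / not taken.
trace = [(0x400, i % 2 == 0) for i in range(100)]
rate = prediction_rate(PerceptronPredictor(table_size=16, history_length=4), trace)
print(rate)
```

After a short warm-up the alternating pattern is captured by the weights, so the rate on this toy trace should be close to 1.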
Check process: We have a known benchmark suite (SPEC 2000) and will run our tests on it.
We will compare the results of runs using two-level branch predictors with those using
perceptron-based predictors. To check correctness we will compare program outputs. They
should be identical, since branch prediction should not change functionality. If the outputs
are identical, we will compare the prediction rates of the two.
In addition, we will run statistical checks on the weight magnitudes of different perceptrons and try
to prove or disprove our suggestion.
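One possible way to quantify the weight statistics mentioned above is to measure how much of a perceptron's total absolute weight sits in its few largest weights. The metric and the function name below are assumptions made for illustration; the proposal does not specify which statistic will be used.

```python
def weight_concentration(weights, top_k):
    """Share of the total absolute weight carried by the top_k largest-magnitude weights."""
    magnitudes = sorted((abs(w) for w in weights), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # untrained perceptron: no weight mass to distribute
    return sum(magnitudes[:top_k]) / total

# Hypothetical trained weights: two history positions dominate.
example = [16, 0, 1, -16, 0, 1]
print(weight_concentration(example, top_k=2))
```

A value near 1 for a small top_k would support the suggestion that a branch depends mainly on a few other branches; a value near top_k divided by the history length would indicate that the weight is spread evenly.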
1. Study the background.
2. Install SimpleScalar and get familiar with it.
3. Code a "dummy" predictor and use it to be sure that we understand
how branch prediction is handled in the SimpleScalar platform.
4. Code the perceptron predictor itself. (Yet to be checked)
To be done:
5. Benchmarking.
6. A special study of our suggestion regarding perceptron predictor