CPS 5401
In-class Activity
Name ________________________________
Roofline Model and Code Balance
A quad-core computer has a measured memory bandwidth of 2 Gword/s and a peak
performance of 8 Gflop/s. We consider only floating-point operations.
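As a reminder of how the roofline bound is formed, the following is a minimal sketch in Python, assuming the standard roofline formulation in which attainable performance is the smaller of the peak performance and the product of memory bandwidth and operational intensity. The function name `attainable_gflops` and the sample intensities are illustrative and not part of the activity.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gwords):
    """Roofline bound in Gflop/s for a given operational intensity.

    intensity        -- flops per memory word moved (assumed non-negative)
    peak_gflops      -- peak floating-point performance in Gflop/s
    bandwidth_gwords -- measured memory bandwidth in Gword/s
    """
    if intensity < 0:
        raise ValueError("operational intensity must be non-negative")
    # The memory roof grows linearly with intensity; the compute roof is flat.
    return min(peak_gflops, bandwidth_gwords * intensity)


# Machine parameters taken from the problem statement.
PEAK_GFLOPS = 8.0
BANDWIDTH_GWORDS = 2.0

if __name__ == "__main__":
    # Arbitrary sample intensities, chosen only for illustration.
    for oi in (0.5, 1.0, 16.0):
        bound = attainable_gflops(oi, PEAK_GFLOPS, BANDWIDTH_GWORDS)
        print(f"intensity {oi:5.2f} flops/word -> bound {bound:.2f} Gflop/s")
```

Evaluating the function over a range of intensities and plotting the result on log-log axes gives the shape asked for in question 1.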
1. Draw the roofline graph for this computer on a log-log graph.
2. For this computer, characterize the applications that would not benefit from an
increase in the computer’s theoretical peak performance.
3. An application is balanced for a computer if $T_{\text{mem}} \le T_{\text{comp}}$. A balanced
computation is also said to be compute-bound, while an imbalanced computation is
memory-bound. Every application will have a corresponding dot in the roofline
graph based on its operational intensity. Where in the graph will we find the dots for
the applications that are balanced?
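The balance condition can be checked numerically. The sketch below assumes that the memory time is the number of words moved divided by the bandwidth, that the compute time is the number of flops divided by the peak performance, and that "balanced" means the memory time does not exceed the compute time (consistent with balanced computations being called compute-bound). The function name `is_balanced` and the workload sizes in the usage lines are hypothetical.

```python
def is_balanced(gflops_of_work, gwords_moved, peak_gflops, bandwidth_gwords):
    """Return True if memory time does not exceed compute time.

    gflops_of_work   -- total floating-point work, in Gflop
    gwords_moved     -- total memory traffic, in Gword
    peak_gflops      -- peak performance in Gflop/s
    bandwidth_gwords -- memory bandwidth in Gword/s
    """
    t_comp = gflops_of_work / peak_gflops    # seconds spent computing
    t_mem = gwords_moved / bandwidth_gwords  # seconds spent moving data
    return t_mem <= t_comp


if __name__ == "__main__":
    # Hypothetical workloads on the machine from the problem statement.
    print(is_balanced(80.0, 10.0, 8.0, 2.0))  # t_comp = 10 s, t_mem = 5 s
    print(is_balanced(8.0, 10.0, 8.0, 2.0))   # t_comp = 1 s,  t_mem = 5 s
```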
For questions 4, 5, and 6, consider the following pseudocode for a snippet of a
stencil computation that operates on an n × n array stored in row-major order.
Assume each array element occupies one memory word.
for i = 1 to n-2
    for j = 1 to n-2
        x[i][j] += x[i][j+1] + 2*x[i+1][j] + x[i][j-1] + 2*x[i-1][j]
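For experimenting with the stencil, here is a minimal runnable sketch of the pseudocode in Python. It assumes the array is a square list of lists of floats and that the loop bounds are inclusive, so both indices run from 1 through n-2. The function name `stencil_sweep` is hypothetical.

```python
def stencil_sweep(x):
    """Apply one in-place sweep of the stencil to the interior of x.

    x is an n x n list of lists; boundary rows and columns are read
    but never written, matching the loop bounds 1..n-2 in the pseudocode.
    """
    n = len(x)
    for i in range(1, n - 1):        # i = 1 .. n-2 inclusive
        for j in range(1, n - 1):    # j = 1 .. n-2 inclusive
            x[i][j] += (x[i][j + 1] + 2 * x[i + 1][j]
                        + x[i][j - 1] + 2 * x[i - 1][j])
    return x


if __name__ == "__main__":
    grid = [[1.0] * 3 for _ in range(3)]  # smallest case: one interior point
    stencil_sweep(grid)
    print(grid[1][1])
```

Because the update is in place, each interior point sees already-updated values from the row above and the element to its left, exactly as in the pseudocode.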
4. Assuming the entire array will fit in the computer’s cache, what is the operational
intensity of the code and is it compute-bound or memory-bound on the computer
described above?
5. Assuming that only two rows of the array can fit in the cache and that the cache
uses a least recently used (LRU) replacement policy, what is the operational
intensity of the code and is it compute-bound or memory-bound?
6. Suppose that the theoretical peak performance is quadrupled (e.g., by
quadrupling the number of cores) but that the effective memory bandwidth is only
doubled. How would these changes affect your answers to questions 4 and 5?
