CPS 5401 In-class Activity
Name ________________________________

Roofline Model and Code Balance

A quad-core computer has a measured memory bandwidth of 2 Gword/s and a peak performance of 8 Gflop/s. We consider only floating-point operations.

1. Draw the roofline graph for this computer on a log-log graph.

2. For this computer, characterize the applications that would not benefit from an increase in the computer's theoretical peak performance.

3. An application is balanced for a computer if T_mem ≤ T_comp. A balanced computation is also said to be compute-bound, while an imbalanced computation is memory-bound. Every application has a corresponding dot in the roofline graph, positioned according to its operational intensity. Where in the graph will we find the dots for the applications that are balanced?

For questions 4, 5, and 6, consider the following pseudocode for a snippet of a stencil computation that operates on an n x n array stored in row-major order. Assume each array element occupies one memory word.

    for i = 1 to n-2
        for j = 1 to n-2
            x[i][j] += x[i][j+1] + 2*x[i+1][j] + x[i][j-1] + 2*x[i-1][j]

4. Assuming the entire array fits in the computer's cache, what is the operational intensity of the code, and is it compute-bound or memory-bound on the computer described above?

5. Assuming that only two rows of the array fit in the cache and that the cache uses a least recently used (LRU) replacement policy, what is the operational intensity of the code, and is it compute-bound or memory-bound?

6. Suppose that the theoretical peak performance is quadrupled (e.g., by quadrupling the number of cores) but that the effective memory bandwidth is only doubled. How would these changes affect your answers to questions 4 and 5?
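
The roofline bound behind questions 1 through 3 can be explored numerically. The following is a minimal Python sketch, assuming the standard roofline formula (attainable performance is the smaller of the peak performance and the memory bandwidth times the operational intensity). The function names and the sample intensities are illustrative assumptions, not part of the activity; the default machine parameters are the 8 Gflop/s peak and 2 Gword/s bandwidth stated above.

```python
def attainable_gflops(intensity, peak_gflops=8.0, bandwidth_gwords=2.0):
    """Roofline bound in Gflop/s for an intensity given in flops per word."""
    return min(peak_gflops, bandwidth_gwords * intensity)


def ridge_point(peak_gflops=8.0, bandwidth_gwords=2.0):
    """Operational intensity (flops per word) where the two roofs meet."""
    return peak_gflops / bandwidth_gwords


def classify(intensity, peak_gflops=8.0, bandwidth_gwords=2.0):
    """Label an intensity, treating the ridge point itself as compute-bound
    (assumed convention, matching T_mem <= T_comp)."""
    if intensity >= ridge_point(peak_gflops, bandwidth_gwords):
        return "compute-bound"
    return "memory-bound"


# Hypothetical sample intensities, chosen only to trace the shape of the roofline.
for oi in [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]:
    print(f"{oi:5.1f} flops/word -> {attainable_gflops(oi):4.1f} Gflop/s ({classify(oi)})")
```

Plotting the printed pairs on log-log axes gives a sloped segment followed by a flat one. Passing different values for `peak_gflops` and `bandwidth_gwords` shows how the ridge point moves when the machine changes, which is one way to check reasoning for question 6.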
