Reorder Buffer Lab

Name: زياد بديع السعيد البنا   ID: 18010717
Name: زياد حماده العبد   ID: 18010722

• Instruction set that shows the strengths of the reorder buffer technique:
  • After 4 cycles: load1 writes its result
  • After 5 cycles: load1 commits and load2 writes back
  • After 8 cycles: load2 commits and add writes its result
  • After 9 cycles: add commits and subtract writes its result
  • After 19 cycles: mult writes back
  • After 20 cycles: mult commits
  • After 50 cycles: divide writes back
  - The instruction set using the reorder buffer = 51 cycles
  - The instruction set theoretically on pipelined processors = 48 cycles

• Instruction set where the reorder buffer does not help:
  • After 4 cycles: load1 writes its result
  • After 5 cycles: load1 commits and load2 writes back
  • After 8 cycles: add writes back, but it cannot commit until all previous instructions have committed
  • After 8 cycles: subtract writes back, but it cannot commit until all previous instructions have committed
  • After 46 cycles: divide writes back
  • After 57 cycles: mult commits
  • Finally, add and subtract can commit
  - The instruction set using the reorder buffer = 60 cycles
  - The instruction set theoretically on pipelined processors = 45 cycles

Cache Blocking

Assumption:
1) We ran the analysis for block sizes 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024.
2) For each block size, we compared the execution time of both methods for different matrix sizes.
3) The maximum matrix size was 2048.

Comparison:
[Plots comparing the execution time of both methods, one plot per block size: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

Comments:
- Increasing the block size improves the performance of the multiplication.
- Beyond a certain block size, performance starts to decrease as the block size increases further.
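The two methods compared above can be illustrated with a minimal sketch. This is not the lab's own code: the function names `matmul_naive` and `matmul_blocked` and the parameter `block` are assumptions made for illustration, and the sketch only shows how a blocked (tiled) loop nest differs from the plain triple loop. In an interpreted language the timing benefit of blocking may be hidden by interpreter overhead, so the sketch is meant to show the loop structure rather than to reproduce the measurements.

```python
def matmul_naive(a, b):
    """Plain triple-loop product of an n x m matrix a and an m x p matrix b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block):
    """Same product, computed tile by tile with tiles of size block x block.

    `block` is a hypothetical tuning parameter playing the role of the
    block size varied in the analysis above.
    """
    if block < 1:
        raise ValueError("block must be at least 1")
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The outer three loops walk over tiles; the inner three loops stay
    # inside one tile, so the same small pieces of a, b and c are reused
    # while they are likely still in cache.
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                for i in range(ii, min(ii + block, n)):
                    row_a = a[i]
                    row_c = c[i]
                    for k in range(kk, min(kk + block, m)):
                        a_ik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

Calling `matmul_blocked(a, b, 32)` should return the same matrix as `matmul_naive(a, b)`; only the order in which the elements are visited changes. The `min(...)` bounds let the block size be any positive integer, including values that do not divide the matrix dimensions.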
Code:

Using GPU in Matrix Operations

Assumption:
1) Matrix initialization is not considered in this analysis, although initialization on the GPU is 10 times faster than on the CPU.
2) We ran the analysis over a wide range of sizes, with a step of only 10 between consecutive sizes.
3) The maximum size was 10000.

Analysis:
1) For small sizes, the CPU and GPU performance is the same.
2) For large sizes, the GPU performance is better than the CPU performance.

Code: