Architecture Labs

Reorder Buffer Lab

Name: زياد بديع السعيد البنا (ID: 18010717)
Name: زياد حماده العبد (ID: 18010722)
• Instruction set 1: shows the strengths of the reorder buffer technique.
• After 4 cycles, load1 writes its result.
• After 5 cycles, load1 commits and load2 writes back.
• After 8 cycles, load2 commits and the add writes its result.
• After 9 cycles, the add commits and the subtract writes its result.
• After 19 cycles, the multiply writes back.
• After 20 cycles, the multiply commits.
• After 50 cycles, the divide writes back.
- The instruction set using the reorder buffer = 51 cycles.
- The instruction set theoretically on a pipelined processor = 48 cycles.
• Instruction set 2: a case where the reorder buffer does not help.
• After 4 cycles, load1 writes its result.
• After 5 cycles, load1 commits and load2 writes back.
• After 8 cycles, the add writes back, but it cannot commit until all previous instructions have committed.
• After 8 cycles, the subtract writes back, but it too cannot commit until all previous instructions have committed.
• After 46 cycles, the divide writes back.
• After 57 cycles, the multiply commits.
• Finally, the add and subtract can commit.
- The instruction set using the reorder buffer = 60 cycles.
- The instruction set theoretically on a pipelined processor = 45 cycles.
A minimal sketch of the in-order commit rule behind this behaviour follows below.
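To make the commit behaviour concrete, here is a minimal sketch (not the lab's code) of in-order commit from a reorder buffer. The instruction names and write-back cycles are made-up values, chosen only so that a short-latency add and subtract finish early but must wait behind a long-latency multiply, as in the second instruction set above.

#include <cstdio>
#include <cstddef>
#include <vector>

// One reorder-buffer entry: instruction name plus the (hypothetical) cycle
// in which its result is written back into the buffer.
struct RobEntry {
    const char* op;
    int writeback_cycle;
};

int main() {
    // Hypothetical program order: a long-latency multiply sits ahead of an
    // add and a subtract whose results are ready much earlier.
    std::vector<RobEntry> rob = {
        {"load", 4},
        {"mult", 20},
        {"add",  6},
        {"sub",  7},
    };

    // Commit at most one instruction per cycle, strictly in program order.
    // The head of the buffer may commit only after its own write-back, so the
    // add and subtract wait for the multiply even though they finished first.
    std::size_t head = 0;
    for (int cycle = 1; head < rob.size(); ++cycle) {
        if (rob[head].writeback_cycle < cycle) {
            std::printf("cycle %2d: %s commits\n", cycle, rob[head].op);
            ++head;
        }
    }
    return 0;
}

With these made-up latencies the program prints commits at cycles 5, 21, 22 and 23: the add and subtract commit only after the multiply, which mirrors why the second instruction set needs 60 cycles even though its add and subtract write back at cycle 8.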
Cache Blocking

Name: زياد حمادة العبد (University ID: 18010722)
Name: زياد بديع البنا (University ID: 18010717)
Assumptions:
1) The analysis was done for block sizes 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024.
2) For each block size, the execution time of both methods was compared across different matrix sizes.
3) The maximum matrix size was 2048.
Comparison:
Execution-time plots were produced for each block size (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024); the figures are omitted here.
Comments:
- Increasing the block size initially improves the performance of the multiplication, because each tile of the matrices is reused while it still fits in the cache.
- Beyond a certain block size, performance starts to decrease again, since the working set of a tile no longer fits in the cache and the locality benefit is lost.
Code:
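The lab's own listing is not reproduced above. As a stand-in, here is a minimal sketch of the two methods being compared: a naive triple-loop multiplication and a blocked (tiled) version. The row-major layout, the double element type, and the parameter names N and B are assumptions made for illustration, not the lab's actual code.

#include <algorithm>
#include <cstddef>
#include <vector>

// Naive triple loop: the inner loop walks Bm column by column, so for large N
// almost every access to Bm misses the cache.
void matmul_naive(const std::vector<double>& A, const std::vector<double>& Bm,
                  std::vector<double>& C, std::size_t N) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                sum += A[i * N + k] * Bm[k * N + j];
            C[i * N + j] = sum;
        }
}

// Blocked (tiled) version: works on B x B tiles so that the tiles of A, Bm
// and C being combined stay resident in the cache while they are reused.
// C must be zero-initialised before the call.
void matmul_blocked(const std::vector<double>& A, const std::vector<double>& Bm,
                    std::vector<double>& C, std::size_t N, std::size_t B) {
    for (std::size_t ii = 0; ii < N; ii += B)
        for (std::size_t kk = 0; kk < N; kk += B)
            for (std::size_t jj = 0; jj < N; jj += B)
                for (std::size_t i = ii; i < std::min(ii + B, N); ++i)
                    for (std::size_t k = kk; k < std::min(kk + B, N); ++k) {
                        double a = A[i * N + k];
                        for (std::size_t j = jj; j < std::min(jj + B, N); ++j)
                            C[i * N + j] += a * Bm[k * N + j];
                    }
}

Timing matmul_blocked against matmul_naive over the block sizes listed above reproduces the pattern in the comments: performance improves while the tiles fit in the cache, then falls off once B grows past that point.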
Using GPU in Matrix Operations

Name: زياد حمادة العبد (University ID: 18010722)
Name: زياد بديع البنا (University ID: 18010717)
Assumptions:
1) Matrix initialization is not included in the measurements, although initialization on the GPU is about 10 times faster than on the CPU.
2) The analysis was done over a large range of matrix sizes, with a step of only 10 between consecutive sizes.
3) The maximum matrix size was 10000.
Analysis:
1) For small sizes, CPU and GPU performance are roughly the same, since kernel-launch and data-transfer overhead offsets the GPU's parallelism.
2) For large sizes, the GPU clearly outperforms the CPU.
Code:
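The GPU listing is likewise not reproduced above. Below is a minimal CUDA sketch of the kind of comparison described, with one thread computing one element of the product. The kernel name, the matrix size N = 1024 and the 16 x 16 thread-block shape are illustrative assumptions, not the lab's actual code.

#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Each thread computes a single element of C = A * B (square N x N matrices).
__global__ void matmul_kernel(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}

int main() {
    const int N = 1024;                                   // assumed size for the demo
    const size_t bytes = size_t(N) * N * sizeof(float);
    std::vector<float> hA(N * N, 1.0f), hB(N * N, 2.0f), hC(N * N);

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    matmul_kernel<<<grid, block>>>(dA, dB, dC, N);
    cudaDeviceSynchronize();                              // wait before stopping any timer

    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    std::printf("C[0] = %.1f\n", hC[0]);                  // expect N * 1.0 * 2.0 = 2048.0
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return 0;
}

For small matrices the cudaMemcpy transfers and the kernel launch dominate the measured time, which is consistent with the observation above that CPU and GPU performance are similar for small sizes.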