Architecture Labs

Reorder Buffer Lab

Name: زياد بديع السعيد البنا (ID: 18010717)
Name: زياد حماده العبد (ID: 18010722)
• Instruction set 1: shows the strengths of the reorder buffer technique.
• After 4 cycles, load1 writes its result.
• After 5 cycles, load1 commits and load2 writes back.
• After 8 cycles, load2 commits and the add writes its result.
• After 9 cycles, the add commits and the subtract writes its result.
• After 19 cycles, the multiply writes back.
• After 20 cycles, the multiply commits.
• After 50 cycles, the divide writes back.
- The instruction set using the reorder buffer = 51 cycles.
- The instruction set theoretically on a pipelined processor = 48 cycles.
• Instruction set 2: a case where the reorder buffer does not help.
• After 4 cycles, load1 writes its result.
• After 5 cycles, load1 commits and load2 writes back.
• After 8 cycles, the add writes back, but it cannot commit until all previous instructions have committed.
• After 8 cycles, the subtract writes back, but it too cannot commit until all previous instructions have committed.
• After 46 cycles, the divide writes back.
• After 57 cycles, the multiply commits.
• Finally, the add and subtract can commit.
- The instruction set using the reorder buffer = 60 cycles.
- The instruction set theoretically on a pipelined processor = 45 cycles.
A minimal sketch of the in-order commit rule behind this behaviour follows below.
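To make the commit behaviour concrete, here is a minimal sketch (not the lab's code) of in-order commit from a reorder buffer. The instruction names and write-back cycles are made-up values, chosen only so that a short-latency add and subtract finish early but must wait behind a long-latency multiply, as in the second instruction set above.

#include <cstdio>
#include <cstddef>
#include <vector>

// One reorder-buffer entry: instruction name plus the (hypothetical) cycle
// in which its result is written back into the buffer.
struct RobEntry {
    const char* op;
    int writeback_cycle;
};

int main() {
    // Hypothetical program order: a long-latency multiply sits ahead of an
    // add and a subtract whose results are ready much earlier.
    std::vector<RobEntry> rob = {
        {"load", 4},
        {"mult", 20},
        {"add",  6},
        {"sub",  7},
    };

    // Commit at most one instruction per cycle, strictly in program order.
    // The head of the buffer may commit only after its own write-back, so the
    // add and subtract wait for the multiply even though they finished first.
    std::size_t head = 0;
    for (int cycle = 1; head < rob.size(); ++cycle) {
        if (rob[head].writeback_cycle < cycle) {
            std::printf("cycle %2d: %s commits\n", cycle, rob[head].op);
            ++head;
        }
    }
    return 0;
}

With these made-up latencies the program prints commits at cycles 5, 21, 22 and 23: the add and subtract commit only after the multiply, which mirrors why the second instruction set needs 60 cycles even though its add and subtract write back at cycle 8.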
Cache Blocking

Name: زياد حمادة العبد (University ID: 18010722)
Name: زياد بديع البنا (University ID: 18010717)
Assumptions:
1) The analysis was done for block sizes 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024.
2) For each block size, the execution time of both methods was compared across different matrix sizes.
3) The maximum matrix size was 2048.
Comparison:
Execution-time plots were produced for each block size (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024); the figures are omitted here.
Comments:
- Increasing the block size initially improves the performance of the multiplication, because each tile of the matrices is reused while it still fits in the cache.
- Beyond a certain block size, performance starts to decrease again, since the working set of a tile no longer fits in the cache and the locality benefit is lost.
Code:
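The lab's own listing is not reproduced above. As a stand-in, here is a minimal sketch of the two methods being compared: a naive triple-loop multiplication and a blocked (tiled) version. The row-major layout, the double element type, and the parameter names N and B are assumptions made for illustration, not the lab's actual code.

#include <algorithm>
#include <cstddef>
#include <vector>

// Naive triple loop: the inner loop walks Bm column by column, so for large N
// almost every access to Bm misses the cache.
void matmul_naive(const std::vector<double>& A, const std::vector<double>& Bm,
                  std::vector<double>& C, std::size_t N) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                sum += A[i * N + k] * Bm[k * N + j];
            C[i * N + j] = sum;
        }
}

// Blocked (tiled) version: works on B x B tiles so that the tiles of A, Bm
// and C being combined stay resident in the cache while they are reused.
// C must be zero-initialised before the call.
void matmul_blocked(const std::vector<double>& A, const std::vector<double>& Bm,
                    std::vector<double>& C, std::size_t N, std::size_t B) {
    for (std::size_t ii = 0; ii < N; ii += B)
        for (std::size_t kk = 0; kk < N; kk += B)
            for (std::size_t jj = 0; jj < N; jj += B)
                for (std::size_t i = ii; i < std::min(ii + B, N); ++i)
                    for (std::size_t k = kk; k < std::min(kk + B, N); ++k) {
                        double a = A[i * N + k];
                        for (std::size_t j = jj; j < std::min(jj + B, N); ++j)
                            C[i * N + j] += a * Bm[k * N + j];
                    }
}

Timing matmul_blocked against matmul_naive over the block sizes listed above reproduces the pattern in the comments: performance improves while the tiles fit in the cache, then falls off once B grows past that point.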
Using GPU in Matrix Operations

Name: زياد حمادة العبد (University ID: 18010722)
Name: زياد بديع البنا (University ID: 18010717)
Assumptions:
1) Matrix initialization is not included in the measurements, although initialization on the GPU is about 10 times faster than on the CPU.
2) The analysis was done over a large range of matrix sizes, with a step of only 10 between consecutive sizes.
3) The maximum matrix size was 10000.
Analysis:
1) For small sizes, CPU and GPU performance are roughly the same, since kernel-launch and data-transfer overhead offsets the GPU's parallelism.
2) For large sizes, the GPU clearly outperforms the CPU.
Code:
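The GPU listing is likewise not reproduced above. Below is a minimal CUDA sketch of the kind of comparison described, with one thread computing one element of the product. The kernel name, the matrix size N = 1024 and the 16 x 16 thread-block shape are illustrative assumptions, not the lab's actual code.

#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Each thread computes a single element of C = A * B (square N x N matrices).
__global__ void matmul_kernel(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}

int main() {
    const int N = 1024;                                   // assumed size for the demo
    const size_t bytes = size_t(N) * N * sizeof(float);
    std::vector<float> hA(N * N, 1.0f), hB(N * N, 2.0f), hC(N * N);

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    matmul_kernel<<<grid, block>>>(dA, dB, dC, N);
    cudaDeviceSynchronize();                              // wait before stopping any timer

    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    std::printf("C[0] = %.1f\n", hC[0]);                  // expect N * 1.0 * 2.0 = 2048.0
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return 0;
}

For small matrices the cudaMemcpy transfers and the kernel launch dominate the measured time, which is consistent with the observation above that CPU and GPU performance are similar for small sizes.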