Dominic DiTomaso
EE 6633
• ATAC
• SPATL
• Cuckoo Directory
• Snoopy (broadcast) -> Directory (multicast)
• Large Directory Overhead
• Overhead = P*M
• P - # of processors
• M - # of memory blocks
• 64 nodes: 12.7% overhead
• 256 nodes: 50% overhead
• 1024 nodes: 200% overhead
[Figure: directory storage drawn as an array with dimensions M and P]
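The percentages above can be reproduced with a small sketch. It assumes a full bit-vector directory (one presence bit per processor), 64-byte memory blocks, and one extra state bit per entry. Neither the block size nor the state bit is stated on the slide; they are assumptions chosen because they line up with the quoted figures, and the function name is illustrative.

```python
BLOCK_BYTES = 64   # assumed memory block size (not stated on the slide)
STATE_BITS = 1     # assumed extra state bit per directory entry


def directory_overhead(num_processors, block_bytes=BLOCK_BYTES, state_bits=STATE_BITS):
    """Return directory bits per block as a fraction of the block's data bits."""
    entry_bits = num_processors + state_bits   # one presence bit per processor
    return entry_bits / (block_bytes * 8)


if __name__ == "__main__":
    for p in (64, 256, 1024):
        print(f"{p} nodes: {directory_overhead(p):.1%} overhead")
```

Under these assumptions the per-block overhead grows linearly with P, so total directory storage scales as P*M, which is why the full bit vector stops being practical at hundreds of nodes.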
• Requirements
• Small area, energy, and latency overheads
• Accurate sharer information
• Limited directory-induced invalidations
• Duplicate Tags
• Area-efficient
• High associativity -> high power
• Sparse Directory
• Power-efficient
• Large capacity -> large area
• Coarse-grain vectors, Hierarchical, etc.
• Optical Broadcast Network
• Tagless
• Bloom Filters
• N-ary Cuckoo Hash Table
• Variable directory tags
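The N-ary cuckoo hash table listed above can be illustrated with a toy sketch: each key has one candidate slot in each of N ways, and an insert that finds every candidate occupied displaces an existing entry and tries to re-place it elsewhere. The class name `CuckooTable`, the salted use of Python's built-in `hash`, the `max_displacements` bound, and the policy of returning the entry that could not be placed are all assumptions made for illustration, not details taken from the Cuckoo Directory paper.

```python
import random


class CuckooTable:
    """Toy N-ary cuckoo hash table: one candidate slot per way for each key."""

    def __init__(self, ways=4, slots_per_way=64, max_displacements=32, seed=0):
        self.ways = ways
        self.slots_per_way = slots_per_way
        self.max_displacements = max_displacements
        rng = random.Random(seed)
        # One salt per way stands in for N independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(ways)]
        self.table = [[None] * slots_per_way for _ in range(ways)]

    def _index(self, way, key):
        return hash((self.salts[way], key)) % self.slots_per_way

    def lookup(self, key):
        """Return the stored value for key, or None if it is absent."""
        for way in range(self.ways):
            entry = self.table[way][self._index(way, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        """Insert or update; return an entry that had to be dropped, else None."""
        # Update in place if the key is already stored.
        for way in range(self.ways):
            idx = self._index(way, key)
            entry = self.table[way][idx]
            if entry is not None and entry[0] == key:
                self.table[way][idx] = (key, value)
                return None
        item = (key, value)
        way = 0
        for _ in range(self.max_displacements):
            # Prefer any empty candidate slot.
            for w in range(self.ways):
                idx = self._index(w, item[0])
                if self.table[w][idx] is None:
                    self.table[w][idx] = item
                    return None
            # All candidates are full: swap with the occupant of one way
            # and continue by trying to re-place the displaced entry.
            idx = self._index(way, item[0])
            item, self.table[way][idx] = self.table[way][idx], item
            way = (way + 1) % self.ways
        return item  # gave up; the caller decides what to do with this entry


if __name__ == "__main__":
    directory = CuckooTable()
    directory.insert(0x1F40, {0, 3})      # block address -> set of sharer IDs
    print(directory.lookup(0x1F40))
```

In a directory setting, the keys would be block addresses and the values sharer sets; an entry returned by `insert` corresponds to a block the directory can no longer track.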
• Large directory overhead at 1000s of cores
• Solutions
• Optics: ATAC
• Tagless: SPATL
• Hash Tables: Cuckoo
• Variable Tags: SCD
[1] G. Kurian, J. E. Miller, J. Psota, J. Eastep, J. Liu, J. Michel, L. C. Kimerling, and A. Agarwal, "ATAC: A 1000-core cache-coherent processor with on-chip optical network," in Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT '10), 2010.
[2] D. Sanchez and C. Kozyrakis, "SCD: A scalable coherence directory with flexible sharer set encoding," in Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture (HPCA '12), 2012.
[3] H. Zhao, A. Shriraman, S. Dwarkadas, and V. Srinivasan, "SPATL: Honey, I Shrunk the Coherence Directory," in Proceedings of the 20th International Conference on Parallel Architectures and Compilation Techniques (PACT '11), 2011.
[4] M. Ferdman, P. Lotfi-Kamran, K. Balet, and B. Falsafi, "Cuckoo directory: A scalable directory for many-core systems," in Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA '11), pp. 169-180, Feb. 2011.