Scalable Directory Protocols for 1000s of Cores

Dominic DiTomaso

EE 6633

Outline

Introduction

Background

• ATAC

• SPATL

• Cuckoo Directory

SCD

Conclusions

Directory Protocols

• Snoopy (broadcast) -> Directory (multicast)

• Large Directory Overhead

• Overhead = P*M

• P - # of processors

• M - # of memory blocks

• 64 nodes: 12.5% overhead

• 256 nodes: 50% overhead

• 1024 nodes: 200% overhead
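
The percentages above can be reproduced with a short calculation. This is a sketch under one assumption that the slide does not state: each memory block is 64 bytes (512 bits) and the directory keeps one presence bit per node for every block. Under that assumption the per-block overhead is P / 512, which matches the 256-node and 1024-node figures. The function name `full_map_overhead` is made up for this illustration.

```python
def full_map_overhead(num_nodes: int, block_bytes: int = 64) -> float:
    """Directory bits per block divided by data bits per block.

    Assumes a full-map sharer vector (one bit per node) and a
    64-byte block; both are assumptions, not stated in the slides.
    """
    data_bits = block_bytes * 8
    return num_nodes / data_bits


if __name__ == "__main__":
    for nodes in (64, 256, 1024):
        print(f"{nodes} nodes: {full_map_overhead(nodes):.1%} overhead")
```

Because the sharer vector grows linearly with P while the block size stays fixed, the directory exceeds the size of the data it tracks once P passes 512 nodes.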

[Figure: directory array with dimensions M and P]

Directory Protocols (cont.)

• Requirements

• Small area, energy, and latency overheads

• Accurate sharer information

• Limited directory-induced invalidations

• Duplicate Tags

• Area-efficient

• High associativity -> high power

• Sparse Directory

• Power-efficient

• Large capacity -> large area

• Coarse-grain vectors, Hierarchical, etc.

ATAC

• Optical Broadcast Network

SPATL

• Tagless

• Bloom Filters
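
As a reminder of what a Bloom filter provides, here is a minimal generic sketch. It is not SPATL's actual hardware organization: the class name `BloomFilter`, the SHA-256 based hashing, and the sizes are all assumptions chosen for illustration. The point it shows is that membership can be tested without storing the address tag itself, at the cost of occasional false positives and with no false negatives.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter over block addresses (illustrative only)."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a Python int

    def _positions(self, address: int):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, address: int) -> None:
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address: int) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(address))


if __name__ == "__main__":
    cached = BloomFilter()
    cached.add(0x1000)
    print(cached.might_contain(0x1000))  # an inserted address always tests True
```

In a coherence setting, a false positive would presumably cost an unnecessary message to a core that does not hold the block, rather than affect correctness.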

Cuckoo Directory

• N-ary Cuckoo Hash Table
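
The idea of an N-ary cuckoo hash table can be sketched in software as follows. This is a generic textbook-style illustration, not the hardware design from the Cuckoo Directory paper: `CuckooTable`, its parameters, and the SHA-256 based indexing are assumptions. Each key has one candidate slot per way; when all candidates are occupied, an occupant is displaced and reinserted into one of its own alternative slots.

```python
import hashlib


class CuckooTable:
    """Generic N-ary cuckoo hash table (illustrative only)."""

    def __init__(self, num_ways=4, sets_per_way=8, max_displacements=32):
        self.num_ways = num_ways
        self.sets_per_way = sets_per_way
        self.max_displacements = max_displacements
        self.ways = [[None] * sets_per_way for _ in range(num_ways)]

    def _index(self, way, key):
        # A different hash function per way, derived by salting with the way id.
        digest = hashlib.sha256(f"{way}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.sets_per_way

    def lookup(self, key):
        for way in range(self.num_ways):
            entry = self.ways[way][self._index(way, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        """Insert or update; return an evicted (key, value) pair or None."""
        for way in range(self.num_ways):
            idx = self._index(way, key)
            entry = self.ways[way][idx]
            if entry is not None and entry[0] == key:
                self.ways[way][idx] = (key, value)
                return None
        item = (key, value)
        victim_way = 0
        for _ in range(self.max_displacements):
            # Use any empty candidate slot for the item in hand.
            for way in range(self.num_ways):
                idx = self._index(way, item[0])
                if self.ways[way][idx] is None:
                    self.ways[way][idx] = item
                    return None
            # All candidates full: swap with one occupant and retry with it.
            idx = self._index(victim_way, item[0])
            self.ways[victim_way][idx], item = item, self.ways[victim_way][idx]
            victim_way = (victim_way + 1) % self.num_ways
        return item  # displacement limit reached; this entry is dropped
```

In this sketch, an entry returned by `insert` is one the table gave up on placing. In a directory, dropping an entry would correspond to a directory-induced invalidation, which is the cost the requirements slide asks to limit.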

SCD

• Variable directory tags

SCD (cont.)

SCD (cont.)

Conclusions

• Large directory overhead at 1000s of cores

• Solutions

• Optics – ATAC

• Tagless – SPATL

• Hash Tables – Cuckoo

• Variable Tags – SCD

References

[1] George Kurian, Jason E. Miller, James Psota, Jonathan Eastep, Jifeng Liu, Jurgen Michel, Lionel C. Kimerling, and Anant Agarwal, “ATAC: a 1000-core cache-coherent processor with on-chip optical network,” In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT '10), 2010.

[2] Daniel Sanchez and Christos Kozyrakis, “SCD: A scalable coherence directory with flexible sharer set encoding,” In Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture (HPCA '12), 2012.

[3] H. Zhao, A. Shriraman, S. Dwarkadas, and V. Srinivasan, “SPATL: Honey, I Shrunk the Coherence Directory,” In Proceedings of the 20th International Conference on Parallel Architectures and Compilation Techniques (PACT '11), 2011.

[4] M. Ferdman, P. Lotfi-Kamran, K. Balet, B. Falsafi, “Cuckoo directory: A scalable directory for many-core systems,” 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA), pp. 169-180, Feb. 2011.