Conference Paper

A scalable cache coherent architecture for large-scale mesh-connected multiprocessors

Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Taejon
DOI: 10.1109/ISPAN.1997.645056 Conference: Parallel Architectures, Algorithms, and Networks, 1997. (I-SPAN '97) Proceedings., Third International Symposium on
Source: IEEE Xplore

ABSTRACT Various limited directory-based cache coherence protocols have been proposed for medium- and large-scale multiprocessors that employ scalable directory memories. For widely shared data, however, most of these protocols suffer from extraneous cache invalidations or updates because they have too few pointers. We focus on large-scale mesh-connected multiprocessors built on wormhole-routed networks with dimension-ordered routing. In such networks, worms are the basic building blocks of communication, and they transit all the intermediate nodes on their way to a destination. Based on this observation, we propose a new directory-based protocol, DirQ, with limited pointers, each of which can represent either a single node or a set of nodes when the data is widely shared. For a √N×√N processor system, our protocol needs Θ(N^(3/2) log N) bits of directory memory, which is much more scalable than the full-map protocol. In terms of latency and traffic volume for cache coherence, our analytic models show that DirQ outperforms other limited protocols and is comparable to the full-map one.
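The two technical points in the abstract can be illustrated with a small sketch. It is not taken from the paper: `xy_route` shows generic XY dimension-ordered routing on a mesh (the X-first order is an assumption, since the abstract does not say which dimension is routed first), and `fullmap_to_dirq_growth` compares only growth rates, assuming a full-map directory needs on the order of N^2 bits in total and setting every hidden constant to 1. Both function names are hypothetical.

```python
import math


def xy_route(src, dst):
    """Nodes visited from src to dst under XY dimension-ordered routing.

    The X coordinate is corrected first, then Y (assumed order), so the
    message passes through every intermediate node on that L-shaped path.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def fullmap_to_dirq_growth(n):
    """Growth-rate ratio N^2 / (N^(3/2) * log2 N), with constants set to 1.

    Illustrative only: the real constants depend on block counts and
    pointer encoding, which the abstract does not give.
    """
    return n ** 2 / (n ** 1.5 * math.log2(n))


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    for n in (64, 1024, 16384):
        print(n, fullmap_to_dirq_growth(n))
```

Under these assumptions the ratio grows as √N / log N, which is the sense in which the stated bound scales better than a full-map directory as the mesh gets larger.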

  • ABSTRACT: Directory-based cache coherence protocols have become the preferred choice for designing large-scale CC-NUMA systems. The directory contains the sharing information for all memory blocks in the system. The number of copies of shared data in the system and their distribution are strongly influenced by the characteristics of the application programs. In this paper, a Markov chain model is built for the distribution pattern of shared data in CC-NUMA systems. It is proved that the average number of cached copies of shared data in CC-NUMA systems is small. In particular, when the percentage of shared read operations in applications is lower than 80%, this number is less than 5.
    Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on; 01/2007