Implementing application-specific cache-coherence protocols in configurable hardware

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

Streamlining communication is key to achieving good performance in shared-memory parallel programs. While full hardware support for cache coherence generally offers the best performance, not all parallel machines provide it. Instead, software layers using Shared Virtual Memory (SVM) can be built to enforce coherence at a higher level. In prior work, researchers have studied application-specific cache-coherence protocols implemented either in SVM systems or as handlers run by programmable protocol processors. Since the protocols are specialized to the needs of a single application, they can be particularly helpful in reducing the long latencies and processing overhead that sometimes degrade performance in SVM systems. This paper studies implementing application-specific protocols in hardware, but not via an instruction-based protocol processor as is typical. Instead, we consider configurable implementations based on Field-Programmable Gate Arrays (FPGAs). This approach can be faster than software-based techniques and less expensive than some hardware-based techniques. We study one application, appbt, in detail, including a VHDL-level design of the configurable protocol. We sketch out approaches for other applications as well. Implementing protocol operations in configurable hardware improves communication performance by roughly 11X for a 32-node system. While overall speedups are a more modest 12%, our method is promising because of its flexibility and because it offers a new way of harnessing configurable hardware at the network interface, where it already exists or could be easily added to current systems.
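
The two figures quoted in the abstract (roughly 11X faster communication, 12% overall speedup) can be related with a back-of-the-envelope Amdahl's-law calculation. The sketch below is not taken from the paper. It assumes that "12%" means an overall speedup factor of 1.12, that both figures refer to the same workload, and that only the communication portion of runtime is accelerated. The function names are hypothetical.

```python
def implied_fraction(local_speedup: float, overall_speedup: float) -> float:
    """Fraction f of original runtime spent in the accelerated part.

    Amdahl's law: overall = 1 / ((1 - f) + f / local)
    Solving for f: f = (1 - 1/overall) / (1 - 1/local)
    """
    if local_speedup <= 1.0 or overall_speedup < 1.0:
        raise ValueError("expected local_speedup > 1 and overall_speedup >= 1")
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / local_speedup)


def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Overall speedup when `fraction` of runtime is sped up by `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumed inputs: 11X on communication, 1.12X overall (see lead-in).
f = implied_fraction(11.0, 1.12)
print(f"implied communication fraction: {f:.3f}")
print(f"round-trip overall speedup:     {amdahl_speedup(f, 11.0):.2f}")
```

Under these assumptions, communication would account for a bit under 12% of the original runtime, which is consistent with a large protocol-level gain translating into a modest end-to-end one.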

Original language: English (US)
Title of host publication: Network-Based Parallel Computing
Subtitle of host publication: Communication, Architecture, and Applications - 3rd International Workshop, CANPC 1999, Proceedings
Editors: Anand Sivasubramaniam, Mario Lauria
Publisher: Springer Verlag
Pages: 181-195
Number of pages: 15
ISBN (Print): 3540659153, 9783540659150
DOIs: https://doi.org/10.1007/10704826_13
State: Published - Jan 1 1999
Event: 3rd International Workshop on Communication, Architecture and Applications for Network-based Parallel Computing, CANPC 1999 - Orlando, United States
Duration: Jan 9 1999 to Jan 9 1999

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1602
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd International Workshop on Communication, Architecture and Applications for Network-based Parallel Computing, CANPC 1999
Country: United States
City: Orlando
Period: 1/9/99 to 1/9/99

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Brooks, D., & Martonosi, M. R. (1999). Implementing application-specific cache-coherence protocols in configurable hardware. In A. Sivasubramaniam & M. Lauria (Eds.), Network-Based Parallel Computing: Communication, Architecture, and Applications - 3rd International Workshop, CANPC 1999, Proceedings (pp. 181-195). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1602). Springer Verlag. https://doi.org/10.1007/10704826_13