Application and architectural bottlenecks in large scale distributed shared memory machines

Chris Holt, Jaswinder Pal Singh, John Hennessy

Research output: Contribution to journal › Conference article › peer-review

17 Scopus citations


A number of researchers have presented architectural techniques for scaling a cache-coherent shared address space to much larger processor counts. This paper examines the extent to which applications can achieve reasonable performance on such large-scale, cache-coherent, distributed shared address space machines by determining the problem sizes needed to achieve a reasonable level of efficiency. It also looks at how much programming effort and optimization, beyond that needed at small processor counts, is required to achieve high efficiency. For each application, it discusses the main architectural bottlenecks that prevent smaller problem sizes or less optimized programs from achieving good efficiency.

Original language: English (US)
Pages (from-to): 134-145
Number of pages: 12
Journal: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
State: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 23rd Annual International Symposium on Computer Architecture - Philadelphia, PA, USA
Duration: May 22 1996 to May 24 1996

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture


