Studies done with academic CC-NUMA machines and simulators indicate good potential for application performance. Our goal, therefore, is to investigate whether the CONVEX Exemplar, a commercial distributed shared memory machine, lives up to the expected potential of CC-NUMA machines. If not, we would like to understand what architectural or implementation decisions make it less efficient. On evaluating the delivered performance of the Exemplar, we find that, while a moderate-scale Exemplar machine works well for several applications, it does not work well for some important classes of applications. Further, performance is affected by four fundamental characteristics of the machine, all of which are due to basic implementation and design choices made on the Exemplar. These are: the effect of processor clustering together with limited node-to-network bandwidth, the effect of tertiary caches, the limited user control over data placement, the sequential memory consistency model together with a cache-based cache coherence protocol, and lastly, longer remote latencies.
|Original language||English (US)|
|Number of pages||10|
|Journal||Proceedings of the International Parallel Processing Symposium, IPPS|
|State||Published - Jan 1 1997|
All Science Journal Classification (ASJC) codes
- Hardware and Architecture