Parallel tree building on a range of shared address space multiprocessors: Algorithms and application performance

Hongzhang Shan, Jaswinder Pal Singh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

This paper investigates the performance of five parallel tree building methods in the context of a complete galaxy simulation on four very different platforms that support the coherent shared address space programming model. A proposed algorithm that uses a separate spatial partitioning of the domain for the tree building phase, and thereby eliminates locking at a significant cost in locality and load balance, is found to be the best by far. Changing the tree building algorithm improves overall application performance on commodity-based systems by factors of more than 4 to 40, even on only 16 processors. This allows commodity shared memory platforms to perform well for hierarchical N-body applications for the first time.

Original language: English (US)
Title of host publication: Proceedings of the International Parallel Processing Symposium, IPPS
Publisher: IEEE Comp Soc
Pages: 475-484
Number of pages: 10
ISBN (Print): 0818684046
State: Published - 1998
Event: Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing - Orlando, FL, USA
Duration: Mar 30 1998 to Apr 3 1998

Conference

Conference: Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing
City: Orlando, FL, USA
Period: 3/30/98 to 4/3/98

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
