Abstract
This paper investigates the performance of five parallel tree building methods, in the context of a complete galaxy simulation, on four very different platforms that support the coherent shared address space programming model. A proposed algorithm that uses a separate spatial partitioning of the domain for the tree building phase, and that eliminates locking at a significant cost in locality and load balance, is found to be the best by far. Changing the tree building algorithm improves overall application performance on commodity-based systems by factors of more than 4 to 40, even on only 16 processors. This allows commodity shared memory platforms to perform well for hierarchical N-body applications for the first time.
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings of the International Parallel Processing Symposium, IPPS |
| Publisher | IEEE Comp Soc |
| Pages | 475-484 |
| Number of pages | 10 |
| ISBN (Print) | 0818684046 |
| State | Published - 1998 |
| Event | Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing - Orlando, FL, USA. Duration: Mar 30 1998 → Apr 3 1998 |
Conference
| Conference | Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing |
|---|---|
| City | Orlando, FL, USA |
| Period | 3/30/98 → 4/3/98 |
All Science Journal Classification (ASJC) codes
- Hardware and Architecture
Paper title: Parallel tree building on a range of shared address space multiprocessors: Algorithms and application performance