TY - GEN
T1 - Parallel sorting on cache-coherent DSM multiprocessors
AU - Shan, Hongzhang
AU - Singh, Jaswinder Pal
N1 - Publisher Copyright:
© 1999 IEEE.
PY - 1999
Y1 - 1999
N2 - The performance of parallel sorting is not well understood on hardware cache-coherent shared address space (CC-SAS) multiprocessors, which increasingly dominate the market for tightly-coupled multiprocessing. We study two high-performance parallel sorting algorithms, radix and sample sorting, under three major programming models-a load-store CC-SAS, message passing, and the segmented SHMEM model-on a 64-processor SGI Origin2000. We observe surprisingly good speedups on this demanding application. The performance of radix sort is greatly affected by the programming model and particular implementation used. Sample sort exhibits more uniform performance across programming models on this platform, but it is usually not so good as that of the best radix sort for larger data sets if each is allowed to use the best programming model for itself. The best combination of algorithm and programming model is radix sorting under the SHMEM model for larger data sets and sample sorting under CC-SAS for smaller data sets.
AB - The performance of parallel sorting is not well understood on hardware cache-coherent shared address space (CC-SAS) multiprocessors, which increasingly dominate the market for tightly-coupled multiprocessing. We study two high-performance parallel sorting algorithms, radix and sample sorting, under three major programming models-a load-store CC-SAS, message passing, and the segmented SHMEM model-on a 64-processor SGI Origin2000. We observe surprisingly good speedups on this demanding application. The performance of radix sort is greatly affected by the programming model and particular implementation used. Sample sort exhibits more uniform performance across programming models on this platform, but it is usually not so good as that of the best radix sort for larger data sets if each is allowed to use the best programming model for itself. The best combination of algorithm and programming model is radix sorting under the SHMEM model for larger data sets and sample sorting under CC-SAS for smaller data sets.
UR - http://www.scopus.com/inward/record.url?scp=85031463280&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85031463280&partnerID=8YFLogxK
U2 - 10.1109/SC.1999.10045
DO - 10.1109/SC.1999.10045
M3 - Conference contribution
AN - SCOPUS:85031463280
T3 - ACM/IEEE SC 1999 Conference, SC 1999
SP - 40
BT - ACM/IEEE SC 1999 Conference, SC 1999
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1999 ACM/IEEE Conference on Supercomputing, SC 1999
Y2 - 13 November 1999 through 19 November 1999
ER -