Scalable speculative parallelization on commodity clusters

Hanjun Kim, Arun Raman, Feng Liu, Jae W. Lee, David I. August

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Scopus citations


While clusters of commodity servers and switches are the most popular form of large-scale parallel computers, many programs are not easily parallelized for execution upon them. In particular, high inter-node communication cost and lack of globally shared memory appear to make clusters suitable only for server applications with abundant task-level parallelism and scientific applications with regular and independent units of work. Clever use of pipeline parallelism (DSWP), thread-level speculation (TLS), and speculative pipeline parallelism (Spec-DSWP) can mitigate the costs of inter-thread communication on shared memory multicore machines. This paper presents Distributed Software Multi-threaded Transactional memory (DSMTX), a runtime system which makes these techniques applicable to non-shared memory clusters, allowing them to efficiently address inter-node communication costs. Initial results suggest that DSMTX enables efficient cluster execution of a wider set of application types. For 11 sequential C programs parallelized for a 4-core 32-node (128 total core) cluster without shared memory, DSMTX achieves a geomean speedup of 49×. This compares favorably to the 15× speedup achieved by our implementation of TLS-only support for clusters.
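As a rough reading of the figures above, the reported speedups can be converted into a per-core efficiency. The sketch below is only an illustration: `parallel_efficiency` is a hypothetical helper, not part of DSMTX, and dividing a geomean speedup by the core count is a simplification made here, not a metric the abstract reports.

```python
# Figures taken from the abstract: 32 nodes, 4 cores per node,
# geomean speedups of 49x (DSMTX) and 15x (TLS-only).
NODES = 32
CORES_PER_NODE = 4
TOTAL_CORES = NODES * CORES_PER_NODE  # 128 cores in total

DSMTX_SPEEDUP = 49.0
TLS_ONLY_SPEEDUP = 15.0


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Hypothetical helper: speedup divided by the number of cores used."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


dsmtx_eff = parallel_efficiency(DSMTX_SPEEDUP, TOTAL_CORES)
tls_eff = parallel_efficiency(TLS_ONLY_SPEEDUP, TOTAL_CORES)
relative_gain = DSMTX_SPEEDUP / TLS_ONLY_SPEEDUP

print(f"DSMTX efficiency:    {dsmtx_eff:.1%}")   # 38.3%
print(f"TLS-only efficiency: {tls_eff:.1%}")     # 11.7%
print(f"DSMTX vs TLS-only:   {relative_gain:.1f}x")  # 3.3x
```

Under these assumptions, DSMTX reaches roughly 38% of ideal linear scaling on 128 cores, about 3.3 times the TLS-only figure.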

Original language: English (US)
Title of host publication: Proceedings - 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2010
Number of pages: 12
State: Published - 2010
Event: 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2010 - Atlanta, GA, United States
Duration: Dec 4 2010 → Dec 8 2010

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
ISSN (Print): 1072-4451



All Science Journal Classification (ASJC) codes

  • Hardware and Architecture

Keywords


  • Distributed systems
  • Loop-level parallelism
  • Multi-threaded transactions
  • Pipelined parallelism
  • Software transactional memory
  • Thread-level speculation

