Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)

Siddhartha Jayanti, Robert Tarjan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ (m · (log (npm + 1) + α (n, npm) )) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work-complexity is optimal amongst a class of symmetric algorithms, which include all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google’s recently open-sourced graph-mining library, where they enable “parallel clustering algorithms which scale to graphs with tens of billions of edges” [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and revealed that our algorithms are consistently the fastest both on CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72 core machine; this is 3.1× faster than the state-of-the-art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.

Original languageEnglish (US)
Title of host publicationHOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing
PublisherAssociation for Computing Machinery, Inc
Pages27-28
Number of pages2
ISBN (Electronic)9798400707001
DOIs
StatePublished - Jun 17 2024
Event2024 ACM Workshop on Highlights of Parallel Computing, HOPC 2024 - Nantes, France
Duration: Jun 17 2024 → …

Publication series

NameHOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing

Conference

Conference2024 ACM Workshop on Highlights of Parallel Computing, HOPC 2024
Country/TerritoryFrance
CityNantes
Period6/17/24 → …

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Hardware and Architecture
  • Software
  • Modeling and Simulation

Fingerprint

Dive into the research topics of 'Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)'. Together they form a unique fingerprint.

Cite this