TY - GEN

T1 - Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)

AU - Jayanti, Siddhartha

AU - Tarjan, Robert

N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).

PY - 2024/6/17

Y1 - 2024/6/17

N2 - We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ (m · (log (npm + 1) + α (n, npm) )) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work-complexity is optimal amongst a class of symmetric algorithms, which include all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google’s recently open-sourced graph-mining library, where they enable “parallel clustering algorithms which scale to graphs with tens of billions of edges” [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and revealed that our algorithms are consistently the fastest both on CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72 core machine; this is 3.1× faster than the state-of-the-art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.

AB - We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ (m · (log (npm + 1) + α (n, npm) )) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work-complexity is optimal amongst a class of symmetric algorithms, which include all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google’s recently open-sourced graph-mining library, where they enable “parallel clustering algorithms which scale to graphs with tens of billions of edges” [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and revealed that our algorithms are consistently the fastest both on CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72 core machine; this is 3.1× faster than the state-of-the-art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.

UR - http://www.scopus.com/inward/record.url?scp=85200558175&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85200558175&partnerID=8YFLogxK

U2 - 10.1145/3670684.3673405

DO - 10.1145/3670684.3673405

M3 - Conference contribution

AN - SCOPUS:85200558175

T3 - HOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing

SP - 27

EP - 28

BT - HOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing

PB - Association for Computing Machinery, Inc

T2 - 2024 ACM Workshop on Highlights of Parallel Computing, HOPC 2024

Y2 - 17 June 2024

ER -