TY - GEN
T1 - Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)
AU - Jayanti, Siddhartha
AU - Tarjan, Robert
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/6/17
Y1 - 2024/6/17
N2 - We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ (m · (log (npm + 1) + α (n, npm) )) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work-complexity is optimal amongst a class of symmetric algorithms, which include all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google’s recently open-sourced graph-mining library, where they enable “parallel clustering algorithms which scale to graphs with tens of billions of edges” [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and revealed that our algorithms are consistently the fastest both on CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72 core machine; this is 3.1× faster than the state-of-the-art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.
AB - We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ (m · (log (npm + 1) + α (n, npm) )) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work-complexity is optimal amongst a class of symmetric algorithms, which include all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google’s recently open-sourced graph-mining library, where they enable “parallel clustering algorithms which scale to graphs with tens of billions of edges” [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and revealed that our algorithms are consistently the fastest both on CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72 core machine; this is 3.1× faster than the state-of-the-art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.
UR - http://www.scopus.com/inward/record.url?scp=85200558175&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200558175&partnerID=8YFLogxK
U2 - 10.1145/3670684.3673405
DO - 10.1145/3670684.3673405
M3 - Conference contribution
AN - SCOPUS:85200558175
T3 - HOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing
SP - 27
EP - 28
BT - HOPC 2024 - Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing
PB - Association for Computing Machinery, Inc
T2 - 2024 ACM Workshop on Highlights of Parallel Computing, HOPC 2024
Y2 - 17 June 2024
ER -