TY - GEN
T1 - Near-optimal two-pass streaming algorithm for sampling random walks over directed graphs
AU - Chen, Lijie
AU - Kol, Gillat
AU - Paramonov, Dmitry
AU - Saxena, Raghuvansh R.
AU - Song, Zhao
AU - Yu, Huacheng
N1 - Publisher Copyright: © 2021 Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu.
PY - 2021/7/1
Y1 - 2021/7/1
N2 - For a directed graph G with n vertices and a start vertex u_start, we wish to (approximately) sample an L-step random walk over G starting from u_start with minimum space using an algorithm that only makes few passes over the edges of the graph. This problem found many applications, for instance, in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be Θ(n · L). Prior to our work, a better space complexity was only known with Õ(√L) passes. We essentially settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is Θ(n · √L), by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes p, and shows that any p-pass algorithm for this problem uses Ω(n · L^{1/p}) space. In addition, we show a similar Θ(n · √L) bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an L-step random walk from every vertex in the graph.
AB - For a directed graph G with n vertices and a start vertex u_start, we wish to (approximately) sample an L-step random walk over G starting from u_start with minimum space using an algorithm that only makes few passes over the edges of the graph. This problem found many applications, for instance, in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be Θ(n · L). Prior to our work, a better space complexity was only known with Õ(√L) passes. We essentially settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is Θ(n · √L), by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes p, and shows that any p-pass algorithm for this problem uses Ω(n · L^{1/p}) space. In addition, we show a similar Θ(n · √L) bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an L-step random walk from every vertex in the graph.
KW - Random walk sampling
KW - Streaming algorithms
UR - http://www.scopus.com/inward/record.url?scp=85113892206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113892206&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICALP.2021.52
DO - 10.4230/LIPIcs.ICALP.2021.52
M3 - Conference contribution
AN - SCOPUS:85113892206
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021
A2 - Bansal, Nikhil
A2 - Merelli, Emanuela
A2 - Worrell, James
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021
Y2 - 12 July 2021 through 16 July 2021
ER -