TY - GEN
T1 - A New Information Complexity Measure for Multi-pass Streaming with Applications
AU - Braverman, Mark
AU - Garg, Sumegha
AU - Li, Qian
AU - Wang, Shuo
AU - Woodruff, David P.
AU - Zhang, Jiapeng
N1 - Publisher Copyright:
© 2024 Owner/Author.
PY - 2024/6/10
Y1 - 2024/6/10
N2 - We introduce a new notion of information complexity for multi-pass streaming problems and use it to resolve several important questions in data streams. In the coin problem, one sees a stream of n i.i.d. uniformly random bits and one would like to compute the majority with constant advantage. We show that any constant-pass algorithm must use Ω(logn) bits of memory, significantly extending an earlier Ω(logn) bit lower bound for single-pass algorithms of Braverman-Garg-Woodruff (FOCS, 2020). This also gives the first Ω(logn) bit lower bound for the problem of approximating a counter up to a constant factor in worst-case turnstile streams for more than one pass. In the needle problem, one either sees a stream of n i.i.d. uniform samples from a domain [t], or there is a randomly chosen needle α ∈[t] for which each item independently is chosen to equal α with probability p, and is otherwise uniformly random in [t]. The problem of distinguishing these two cases is central to understanding the space complexity of the frequency moment estimation problem in random order streams. We show tight multi-pass space bounds for this problem for every p < 1/√n log3 n, resolving an open question of Lovett and Zhang (FOCS, 2023); even for 1-pass our bounds are new. To show optimality, we improve both lower and upper bounds from existing results. Our information complexity framework significantly extends the toolkit for proving multi-pass streaming lower bounds, and we give a wide number of additional streaming applications of our lower bound techniques, including multi-pass lower bounds for ℓp-norm estimation, ℓp-point query and heavy hitters, and compressed sensing problems.
AB - We introduce a new notion of information complexity for multi-pass streaming problems and use it to resolve several important questions in data streams. In the coin problem, one sees a stream of n i.i.d. uniformly random bits and one would like to compute the majority with constant advantage. We show that any constant-pass algorithm must use Ω(logn) bits of memory, significantly extending an earlier Ω(logn) bit lower bound for single-pass algorithms of Braverman-Garg-Woodruff (FOCS, 2020). This also gives the first Ω(logn) bit lower bound for the problem of approximating a counter up to a constant factor in worst-case turnstile streams for more than one pass. In the needle problem, one either sees a stream of n i.i.d. uniform samples from a domain [t], or there is a randomly chosen needle α ∈[t] for which each item independently is chosen to equal α with probability p, and is otherwise uniformly random in [t]. The problem of distinguishing these two cases is central to understanding the space complexity of the frequency moment estimation problem in random order streams. We show tight multi-pass space bounds for this problem for every p < 1/√n log3 n, resolving an open question of Lovett and Zhang (FOCS, 2023); even for 1-pass our bounds are new. To show optimality, we improve both lower and upper bounds from existing results. Our information complexity framework significantly extends the toolkit for proving multi-pass streaming lower bounds, and we give a wide number of additional streaming applications of our lower bound techniques, including multi-pass lower bounds for ℓp-norm estimation, ℓp-point query and heavy hitters, and compressed sensing problems.
KW - Information Complexity
KW - Streaming Algorithms
UR - http://www.scopus.com/inward/record.url?scp=85196658862&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196658862&partnerID=8YFLogxK
U2 - 10.1145/3618260.3649672
DO - 10.1145/3618260.3649672
M3 - Conference contribution
AN - SCOPUS:85196658862
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 1781
EP - 1792
BT - STOC 2024 - Proceedings of the 56th Annual ACM Symposium on Theory of Computing
A2 - Mohar, Bojan
A2 - Shinkar, Igor
A2 - O�Donnell, Ryan
PB - Association for Computing Machinery
T2 - 56th Annual ACM Symposium on Theory of Computing, STOC 2024
Y2 - 24 June 2024 through 28 June 2024
ER -