TY - GEN
T1 - The coin problem with applications to data streams
AU - Braverman, Mark
AU - Garg, Sumegha
AU - Woodruff, David P.
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
N2 - Consider the problem of computing the majority of a stream of n i.i.d. uniformly random bits. This problem, known as the coin problem, is central to a number of counting problems in different data stream models. We show that any streaming algorithm for solving this problem with large constant advantage must use Ω(log n) bits of space. We extend our lower bound to proving tight lower bounds for solving multiple, randomly interleaved copies of the coin problem, as well as for solving the OR of multiple copies of a variant of the coin problem. Our proofs involve new measures of information complexity that are well-suited for data streams. We use these lower bounds to obtain a number of new results for data streams. In each case there is an underlying d-dimensional vector x with additive updates to its coordinates given in a stream of length m. The input streams arising from our coin lower bound have nice distributional properties, and consequently for many problems for which we only had lower bounds in general turnstile streams, we now obtain the same lower bounds in more natural models, such as the bounded deletion model, in which ‖x‖_2 never drops by a constant fraction of what it was earlier, or in the random order model, in which the updates are ordered randomly. In particular, in the bounded deletion model, we obtain nearly tight lower bounds for approximating ‖x‖_∞ up to additive error (1/√k)·‖x‖_2, approximating ‖x‖_2 up to a multiplicative (1+ε) factor (resolving a question of Jayaram and Woodruff in PODS 2018), and solving the Point Query and ℓ_2-Heavy Hitters problems. In the random order model, we also obtain new lower bounds for the Point Query and ℓ_2-Heavy Hitters problems. We also give new algorithms complementing our lower bounds and illustrating the tightness of the models we consider, including an algorithm for approximating ‖x‖_∞ up to additive error (1/√k)·‖x‖_2 in turnstile streams (resolving a question of Cormode in a 2006 IITK Workshop), and an algorithm for finding ℓ_2-heavy hitters in randomly ordered insertion streams (which, for random order streams, resolves a question of Nelson in a 2018 Warwick Workshop).
AB - Consider the problem of computing the majority of a stream of n i.i.d. uniformly random bits. This problem, known as the coin problem, is central to a number of counting problems in different data stream models. We show that any streaming algorithm for solving this problem with large constant advantage must use Ω(log n) bits of space. We extend our lower bound to proving tight lower bounds for solving multiple, randomly interleaved copies of the coin problem, as well as for solving the OR of multiple copies of a variant of the coin problem. Our proofs involve new measures of information complexity that are well-suited for data streams. We use these lower bounds to obtain a number of new results for data streams. In each case there is an underlying d-dimensional vector x with additive updates to its coordinates given in a stream of length m. The input streams arising from our coin lower bound have nice distributional properties, and consequently for many problems for which we only had lower bounds in general turnstile streams, we now obtain the same lower bounds in more natural models, such as the bounded deletion model, in which ‖x‖_2 never drops by a constant fraction of what it was earlier, or in the random order model, in which the updates are ordered randomly. In particular, in the bounded deletion model, we obtain nearly tight lower bounds for approximating ‖x‖_∞ up to additive error (1/√k)·‖x‖_2, approximating ‖x‖_2 up to a multiplicative (1+ε) factor (resolving a question of Jayaram and Woodruff in PODS 2018), and solving the Point Query and ℓ_2-Heavy Hitters problems. In the random order model, we also obtain new lower bounds for the Point Query and ℓ_2-Heavy Hitters problems. We also give new algorithms complementing our lower bounds and illustrating the tightness of the models we consider, including an algorithm for approximating ‖x‖_∞ up to additive error (1/√k)·‖x‖_2 in turnstile streams (resolving a question of Cormode in a 2006 IITK Workshop), and an algorithm for finding ℓ_2-heavy hitters in randomly ordered insertion streams (which, for random order streams, resolves a question of Nelson in a 2018 Warwick Workshop).
KW - bounded deletion model
KW - coin problem
KW - heavy hitters
KW - information cost
KW - majority
KW - space lower bounds
KW - streaming algorithms
UR - http://www.scopus.com/inward/record.url?scp=85100337545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100337545&partnerID=8YFLogxK
U2 - 10.1109/FOCS46700.2020.00038
DO - 10.1109/FOCS46700.2020.00038
M3 - Conference contribution
AN - SCOPUS:85100337545
T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
SP - 318
EP - 329
BT - Proceedings - 2020 IEEE 61st Annual Symposium on Foundations of Computer Science, FOCS 2020
PB - IEEE Computer Society
T2 - 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020
Y2 - 16 November 2020 through 19 November 2020
ER -