### Abstract

A matrix M : A × X → {−1, 1} corresponds to the following learning problem: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{−k} · |A| rows and at least 2^{−ℓ} · |X| columns has a bias of at most 2^{−r}. We show that any learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · ℓ), or at least 2^{Ω(r)} samples. The result holds even if the learner has an exponentially small success probability (of 2^{−Ω(r)}). In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least Ω((log |X|) · (log |A|)) or an exponential number of samples, achieving a tight Ω((log |X|) · (log |A|)) lower bound on the size of the memory, rather than the bound of Ω(min{(log |X|)^2, (log |A|)^2}) obtained in previous works by Raz [FOCS '17] and Moshkovitz and Moshkovitz [ITCS '18]. Moreover, our result implies all previous memory-samples lower bounds, as well as a number of new applications. Our proof builds on the work of Raz [FOCS '17], which gave a general technique for proving memory-samples lower bounds.
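To make the setup concrete, the following small Python sketch (not from the paper) instantiates the learning problem for the parity matrix M(a, x) = (−1)^{⟨a, x⟩ mod 2}, the canonical example to which such memory-samples lower bounds apply. The toy dimension `n`, the choice of submatrix, and the `bias` helper are illustrative assumptions, not the paper's construction.

```python
import itertools
import random

n = 4  # toy dimension; A = X = {0,1}^n (illustrative small instance)

def M(a, x):
    # Parity matrix: M(a, x) = (-1)^(<a, x> mod 2)
    return -1 if sum(ai * xi for ai, xi in zip(a, x)) % 2 else 1

domain = list(itertools.product([0, 1], repeat=n))

# The learner's view: a stream of samples (a_i, b_i) with a_i uniform
# and b_i = M(a_i, x) for a hidden uniformly random x.
random.seed(0)
x = random.choice(domain)
stream = [(a, M(a, x)) for a in (random.choice(domain) for _ in range(10))]

def bias(rows, cols):
    # Bias of the submatrix rows x cols: |average entry of M| over the block.
    total = sum(M(a, xp) for a in rows for xp in cols)
    return abs(total) / (len(rows) * len(cols))

# For parity, every large submatrix has small bias (an extractor-style
# property); e.g., halving both the rows and the columns (k = l = 1)
# still leaves the bias well below 1.
rows = domain[: len(domain) // 2]
cols = domain[: len(domain) // 2]
print(bias(domain, domain), bias(rows, cols))
```

The theorem says that for any matrix M whose large submatrices all have bias at most 2^{−r}, a streaming learner observing `stream` needs either Ω(k · ℓ) bits of memory or 2^{Ω(r)} samples to recover x.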

Original language | English (US)
---|---
Title of host publication | STOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing
Editors | Monika Henzinger, David Kempe, Ilias Diakonikolas
Publisher | Association for Computing Machinery
Pages | 297-310
Number of pages | 14
ISBN (Electronic) | 9781450355599
DOIs | https://doi.org/10.1145/3188745.3188962
State | Published - Jun 20 2018
Event | 50th Annual ACM Symposium on Theory of Computing, STOC 2018 - Los Angeles, United States. Duration: Jun 25 2018 → Jun 29 2018

### Publication series

Name | Proceedings of the Annual ACM Symposium on Theory of Computing
---|---
ISSN (Print) | 0737-8017

### Other

Other | 50th Annual ACM Symposium on Theory of Computing, STOC 2018
---|---
Country | United States
City | Los Angeles
Period | 6/25/18 → 6/29/18

### All Science Journal Classification (ASJC) codes

- Software

### Keywords

- Branching programs
- Extractors
- Learning from samples
- Time-space tradeoff


## Cite this

Garg, S., Raz, R., & Tal, A. (2018). Extractor-based time-space lower bounds for learning. In *STOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing* (pp. 297-310). (Proceedings of the Annual ACM Symposium on Theory of Computing). Association for Computing Machinery. https://doi.org/10.1145/3188745.3188962