TY - GEN

T1 - Time-space lower bounds for two-pass learning

AU - Garg, Sumegha

AU - Raz, Ran

AU - Tal, Avishay

N1 - Funding Information:
Ran Raz: Research supported by the Simons Collaboration on Algorithms and Geometry, by a Simons Investigator Award and by the National Science Foundation grant No. CCF-1714779. Avishay Tal: Part of this work was done when the author was a member at the Institute for Advanced Study, Princeton, NJ. Research supported by the Simons Collaboration on Algorithms and Geometry and by the National Science Foundation grant No. CCF-1412958.
Publisher Copyright:
© Sumegha Garg, Ran Raz, and Avishay Tal; licensed under Creative Commons License CC-BY 34th Computational Complexity Conference (CCC 2019).

PY - 2019/7/1

Y1 - 2019/7/1

N2 - A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [11, 7, 12, 9, 2, 5]. For example, any algorithm for learning parities of size n requires either a memory of size Ω(n^2) or an exponential number of samples [11]. All these works modeled the learner as a one-pass branching program, allowing only one pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{1.5}) or at least 2^{Ω(n)} samples. More generally, a matrix M : A × X → {−1, 1} corresponds to the following learning problem: An unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a_1, b_1), (a_2, b_2), ..., where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M of at least 2^{−k} · |A| rows and at least 2^{−ℓ} · |X| columns has a bias of at most 2^{−r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · min{k, ℓ}), or at least 2^{Ω(min{k, ℓ, r})} samples.

AB - A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [11, 7, 12, 9, 2, 5]. For example, any algorithm for learning parities of size n requires either a memory of size Ω(n^2) or an exponential number of samples [11]. All these works modeled the learner as a one-pass branching program, allowing only one pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{1.5}) or at least 2^{Ω(n)} samples. More generally, a matrix M : A × X → {−1, 1} corresponds to the following learning problem: An unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a_1, b_1), (a_2, b_2), ..., where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M of at least 2^{−k} · |A| rows and at least 2^{−ℓ} · |X| columns has a bias of at most 2^{−r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · min{k, ℓ}), or at least 2^{Ω(min{k, ℓ, r})} samples.

KW - Branching program

KW - Lower bounds

KW - PAC learning

KW - Time-space tradeoffs

KW - Two-pass streaming

UR - http://www.scopus.com/inward/record.url?scp=85070701248&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070701248&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.CCC.2019.22

DO - 10.4230/LIPIcs.CCC.2019.22

M3 - Conference contribution

AN - SCOPUS:85070701248

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 34th Computational Complexity Conference, CCC 2019

A2 - Shpilka, Amir

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing

T2 - 34th Computational Complexity Conference, CCC 2019

Y2 - 18 July 2019 through 20 July 2019

ER -