Representation, approximation and learning of submodular functions using low-rank decision trees

Vitaly Feldman, Pravesh Kumar Kothari, Jan Vondrak

Research output: Contribution to journalConference article

9 Scopus citations

Abstract

We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}n. Our main result is the following structural theorem: any submodular function is ε-close in l2 to a real-valued decision tree (DT) of depth O (1/ε2). This immediately implies that any submodular function is e-close to a function of at most 2O(1/ε2) variables and has a spectral l1 norm of 2O(1/ε2). It also implies the closest previous result that states that submodular functions can be approximated by polynomials of degree O(1/ε2) (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank 4/ε2 and a proof that any rank-r DT can be ε-approximated by a DT of depth 5/2(r + log(1/ε)). We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time Õ(n2) · 2O(1/ε4). The best previous algorithm for the problem requires no(1//ε2) time and examples (Cheraghchi et al., 2012) but works also in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings. We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of 2 Ω(1/ε2/3) on the complexity of learning monotone submodular functions in any reasonable model (including learning with value queries); (2) computational lower bound of nΩ(1/ε2/3) based on a reduction to learning of sparse parities with noise, widely-believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.

Original languageEnglish (US)
Pages (from-to)711-740
Number of pages30
JournalJournal of Machine Learning Research
Volume30
StatePublished - Jan 1 2013
Event26th Conference on Learning Theory, COLT 2013 - Princeton, NJ, United States
Duration: Jun 12 2013Jun 14 2013

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

Keywords

  • Decision tree
  • Learning
  • Submodular function
  • Uniform distribution

Fingerprint Dive into the research topics of 'Representation, approximation and learning of submodular functions using low-rank decision trees'. Together they form a unique fingerprint.

  • Cite this