Robust moment estimation and improved clustering via sum of squares

Pravesh K. Kothari, Jacob Steinhardt, David Steurer

Research output: Chapter in Book/Report/Conference proceedingConference contribution

98 Scopus citations

Abstract

We develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers and design a new family of convex relaxations for k-means clustering based on sum-of-squares method. As an immediate corollary, for any > 0, we obtain an efficient algorithm for learning the means of a mixture of k arbitrary Poincaré distributions in d in time dO(1/γ) so long as the means have separation Ω(kγ). This in particular yields an algorithm for learning Gaussian mixtures with separation Ω(kγ), thus partially resolving an open problem of Regev and Vijayaraghavan (2017). The guarantees of our robust estimation algorithms improve in many cases significantly over the best previous ones, obtained in the recent works. We also show that the guarantees of our algorithms match information-theoretic lower-bounds for the class of distributions we consider. These improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers. We also show a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As a consequence, this yields that all of the above algorithmic improvements hold for distributions satisfying the Poincaré inequality.

Original languageEnglish (US)
Title of host publicationSTOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing
EditorsMonika Henzinger, David Kempe, Ilias Diakonikolas
PublisherAssociation for Computing Machinery
Pages182-189
Number of pages8
ISBN (Electronic)9781450355599
DOIs
StatePublished - Jun 20 2018
Event50th Annual ACM Symposium on Theory of Computing, STOC 2018 - Los Angeles, United States
Duration: Jun 25 2018Jun 29 2018

Publication series

NameProceedings of the Annual ACM Symposium on Theory of Computing
ISSN (Print)0737-8017

Other

Other50th Annual ACM Symposium on Theory of Computing, STOC 2018
Country/TerritoryUnited States
CityLos Angeles
Period6/25/186/29/18

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Clustering
  • Moments
  • Robust learning
  • Sum-of-squares

Fingerprint

Dive into the research topics of 'Robust moment estimation and improved clustering via sum of squares'. Together they form a unique fingerprint.

Cite this